LIBSVM can read a data file such as the following and convert it into a sparse data structure in MATLAB (using `libsvmread`).

```
-1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1
-1 3:1 6:1 17:1 27:1 35:1 40:1 57:1 63:1 69:1 73:1 74:1 76:1 81:1 103:1
```

The first column is the label for binary classification, and the remaining columns are the features in `index:value` form. For example, in the first row only positions 3, 11, 14, 19, ... are nonzero.

I have a file in which these positions are not sorted. For example, a line could look like this:

```
-1 11:1 3:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1
```

`libsvmread` won't work in such a situation. Is there any way I can sort the data (according to positions), or is there any existing code that can help me extract this data in MATLAB?

The goal is that, given this sample input:

```
-1 11:1 3:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1
-1 3:1 2:1 6:1 4:1 17:1 27:1 35:1 40:1 57:1 63:1 69:1 73:1 74:1 76:1 81:1 103:1
```

We get the following output:

```
-1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1
-1 2:1 3:1 4:1 6:1 17:1 27:1 35:1 40:1 57:1 63:1 69:1 73:1 74:1 76:1 81:1 103:1
```

Answer:

Store all the info in an array `a[]` and then sort using indices:

```
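awk '{
    delete a                                 # start each line with an empty array
    for (i = 2; i <= NF; i++)
        a[$i+0] = $i                         # "11:1"+0 is 11, so the key is the feature position
    n = asorti(a, sorted, "@ind_num_asc")    # gawk only: sort the keys numerically
    printf "%s%s", $1, OFS                   # print the label
    for (i = 1; i <= n; i++)
        printf "%s%s", a[sorted[i]], (i == n ? ORS : OFS)
}' file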
```

This uses `asorti()` with `@ind_num_asc` to define the ordering mode. Both are GNU awk (gawk) extensions, so `awk` must be gawk here.

For every line, we store in an array `a[]` all the data starting at the 2nd field. Then we sort it numerically by index and print it back in sorted order.

- `delete a` empties the array, so that it only holds data from the current line.
- `for (i=2; i<=NF; i++) a[$i+0]=$i` stores each field as an element in the array. By saying `$i+0` we convert `xx:yy` into just `xx`, so that the indices are just the left part of the field.
- `n=asorti(a, sorted, "@ind_num_asc")` sorts the array by its indices and stores them in the `sorted[]` array. By saying `@ind_num_asc` we tell `asorti` to compare indices numerically, in ascending order.
- `printf "%s%s", $1, OFS` prints the first field, the one that stands alone (the label).
- `for (i=1;i<=n;i++) printf "%s%s", a[sorted[i]], (i==n?ORS:OFS)` loops through the sorted indices and prints the corresponding fields.
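
As a quick check of the `$i+0` step, this small sketch feeds a few made-up fields to awk and prints the number each one is coerced to. The sample fields and the variable name `idx` are illustrative only.

```shell
# Numeric coercion keeps only the leading digits of each "index:value" field.
idx=$(echo "11:1 3:1 103:1" | awk '{ print $1 + 0, $2 + 0, $3 + 0 }')
echo "$idx"
```

Because the array is keyed by this number, a line that repeats a position would keep only the last field seen for that position.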

```
$ awk '{delete a; for (i=2; i<=NF; i++) {a[$i+0]=$i}; n=asorti(a, sorted, "@ind_num_asc"); printf "%s%s", $1, OFS; for (i=1;i<=n;i++) printf "%s%s", a[sorted[i]], (i==n?ORS:OFS)}' a
-1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1
-1 2:1 3:1 4:1 6:1 17:1 27:1 35:1 40:1 57:1 63:1 69:1 73:1 74:1 76:1 81:1 103:1
```
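
If gawk is not available, the same reordering can be sketched with plain awk features by running an insertion sort over the fields in place. This is a swapped-in technique, not the `asorti()` approach above; the function name `sort_features` and the short sample lines are hypothetical, and it assumes whitespace-separated `index:value` fields after the label.

```shell
# sort_features: hypothetical helper; reorders the index:value fields of each
# line by numeric index without asorti(), so it does not depend on gawk.
sort_features() {
  awk '{
    # Insertion sort over fields 2..NF, comparing the numeric index part.
    for (i = 3; i <= NF; i++) {
      field = $i
      key = field + 0            # "11:1" + 0 evaluates to 11
      j = i - 1
      while (j >= 2 && ($j + 0) > key) {
        $(j + 1) = $j            # shift the larger index one slot right
        j--
      }
      $(j + 1) = field
    }
    print
  }' "$@"
}

# Usage on two shortened sample lines (a file name can be passed instead).
sorted=$(printf '%s\n' '-1 11:1 3:1 14:1' '-1 3:1 2:1 6:1 4:1' | sort_features)
echo "$sorted"
```

Unlike the array-based version, this keeps fields that share the same position instead of collapsing them, and it rewrites each modified line with single spaces between fields.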

