I need to build the following block of data in Stata many times (say 50,000), with the copies stacked one below the other. Each block has two variables: (1) a counter going from 1 to 500; and (2) a string variable that is A for the first 25 observations and B for observations 26 to 500. The block is then repeated 50,000 times, giving 50,000 * 500 = 25,000,000 rows in total.

So far I have done this by creating a CSV file with the 500 rows and 2 variables, then reading it in and appending it repeatedly. This is really slow and inefficient. How can I build the data entirely within Stata?


I'm still inclined to consider this question off-topic. It looks like a simple code request. I'll answer with the hope that future questions more clearly state what the programming problem is (including code).

One way is:

set more off

// start from an empty dataset so that -set obs- cannot fail
clear

// change to 500
set obs 15

gen counter = _n

// change the ranges to 1/25 and 26/500
gen strvar = "A" in 1/5
replace strvar = "B" in 6/15

// change to 50,000
expand 3

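// number the copies of each row, then sort so that each block is contiguous
bysort counter : gen seq = _n
sort seq counter

// list (only sensible for the small example)
list, sep(15)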

This is a scaled-down example (15 rows per block, 3 copies), so you need to adjust the numbers. The key command is expand, which replaces each observation with n copies of itself.
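
For reference, here is a sketch of the same approach with the numbers from the question filled in. It assumes roughly 200 MB of free memory for the 25,000,000 rows. The explicit storage types (int, str1, long) and the use of cond() are my additions to keep the dataset small; they are not required. Note that seq must be long (or float), because int tops out below 50,000.

```stata
clear
set more off

// one block: 500 rows
set obs 500
gen int counter = _n

// "A" for rows 1-25, "B" for rows 26-500
gen str1 strvar = cond(_n <= 25, "A", "B")

// 50,000 copies of every row: 25,000,000 rows in total
expand 50000

// seq identifies the copy (1..50000); sorting stacks the blocks
bysort counter : gen long seq = _n
sort seq counter

// sanity checks on the result
assert _N == 50000 * 500
assert counter == mod(_n - 1, 500) + 1
assert strvar == cond(counter <= 25, "A", "B")
```

If you do not need to know which copy a row belongs to, you can drop seq after sorting.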


