I'm new to Docker so, most likely, I'm missing something.

I'm running a container with Elasticsearch, using this image.

I'm able to set up everything correctly. After that I was using a script developed by a colleague to insert some data, basically querying a MySQL database and making HTTP requests.

The problem is that many of those requests get stuck until they fail. If I run netstat -tn | grep 9200 I get:

tcp6       0      0 ::1:58436               ::1:9200                TIME_WAIT  
tcp6       0      0 ::1:59274               ::1:9200                TIME_WAIT 


with a lot of requests. At this point I'm not sure if it's something related to Elasticsearch or Docker. This does not happen if Elasticsearch is installed on my machine.

Some info:

$ docker version
Client version: 1.6.2
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 7c8fca2
OS/Arch (client): linux/amd64
Server version: 1.6.2
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 7c8fca2
OS/Arch (server): linux/amd64

$ docker info
Containers: 6
Images: 103
Storage Driver: devicemapper
 Pool Name: docker-252:1-9188072-pool
 Pool Blocksize: 65.54 kB
 Backing Filesystem: extfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 4.255 GB
 Data Space Total: 107.4 GB
 Data Space Available: 103.1 GB
 Metadata Space Used: 6.758 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.141 GB
 Udev Sync Supported: false
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.82-git (2013-10-04)
Execution Driver: native-0.2
Kernel Version: 3.14.22-031422-generic
Operating System: Ubuntu 14.04.2 LTS
CPUs: 4
Total Memory: 15.37 GiB

$ docker logs elasticsearch
[2015-06-15 09:10:33,761][INFO ][node                     ] [Energizer] version[1.6.0], pid[1], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-06-15 09:10:33,762][INFO ][node                     ] [Energizer] initializing ...
[2015-06-15 09:10:33,766][INFO ][plugins                  ] [Energizer] loaded [], sites []
[2015-06-15 09:10:33,792][INFO ][env                      ] [Energizer] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/ubuntu--vg-root)]], net usable_space [145.3gb], net total_space [204.3gb], types [ext4]
[2015-06-15 09:10:35,516][INFO ][node                     ] [Energizer] initialized
[2015-06-15 09:10:35,516][INFO ][node                     ] [Energizer] starting ...
[2015-06-15 09:10:35,642][INFO ][transport                ] [Energizer] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/]}
[2015-06-15 09:10:35,657][INFO ][discovery                ] [Energizer] elasticsearch/Y1zfiri4QO21zRhcI-bTXA
[2015-06-15 09:10:39,426][INFO ][cluster.service          ] [Energizer] new_master [Energizer][Y1zfiri4QO21zRhcI-bTXA][76dea3e6d424][inet[/]], reason: zen-disco-join (elected_as_master)
[2015-06-15 09:10:39,446][INFO ][http                     ] [Energizer] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/]}
[2015-06-15 09:10:39,446][INFO ][node                     ] [Energizer] started
[2015-06-15 09:10:39,479][INFO ][gateway                  ] [Energizer] recovered [0] indices into cluster_state

The important part of the script:

for package in c.fetchall():
    id_package, tracking_number, order_number, payment_info, shipment_provider_name, package_status_name=package
    el['tracking_number'] = tracking_number
    el['order_number'] = order_number
    el['payment_info'] = payment_info
    el['shipment_provider_name'] = shipment_provider_name
    el['package_status_name'] = package_status_name

    requests.put("http://localhost:9200/packages/package/%s/_create"%(id_package), json=el)


So, it wasn't a problem with either Docker or Elasticsearch. Just to recap: the same script throwing PUT requests at a locally installed Elasticsearch worked, but when throwing them at a container running Elasticsearch it failed after a few thousand documents (about 20k). Note that the overall number of documents was roughly 800k.

So, what happened? When you set up something running on localhost and make a request to it (in this case a PUT request), that request goes through the loopback interface. In practice this means the traffic never leaves the host's own network stack, which makes it a lot faster.

When the Docker container was set up, its ports were bound to the host. Although the script still makes requests to localhost on the desired port, the TCP connection is now established between the host and the Docker container through the docker0 interface. This comes at the expense of 2 things:

This is actually a more realistic scenario. We set up Elasticsearch on another machine, ran the exact same test and got, as expected, the same result.

The problem was that we were sending requests and creating a new connection for each of them. Due to the way TCP works, connections cannot be released immediately: after closing, they linger in the TIME_WAIT state for a while. This meant that we kept using up the available connections until there were none left, because the rate of creation was higher than the actual close rate.
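To make the exhaustion argument concrete, here is a rough back-of-the-envelope sketch. The port range and the TIME_WAIT duration below are assumptions (commonly cited Linux defaults), not values measured on the machine in question, and the function name is made up for illustration.

```python
# Back-of-the-envelope estimate. Both constants are ASSUMED Linux defaults,
# not values taken from the machine in the question.
EPHEMERAL_PORT_RANGE = (32768, 61000)  # assumed net.ipv4.ip_local_port_range
TIME_WAIT_SECONDS = 60                 # assumed time a closed socket stays in TIME_WAIT


def max_new_connections_per_second(port_range, time_wait_seconds):
    """Upper bound on the sustained rate of new connections to one destination.

    Each new connection to the same host:port needs its own local port, and
    that port stays reserved for time_wait_seconds after the connection closes.
    """
    low, high = port_range
    available_ports = high - low + 1
    return available_ports / time_wait_seconds


available_ports = EPHEMERAL_PORT_RANGE[1] - EPHEMERAL_PORT_RANGE[0] + 1
rate = max_new_connections_per_second(EPHEMERAL_PORT_RANGE, TIME_WAIT_SECONDS)

print("local ports available:", available_ports)
print("sustainable new connections per second: %.0f" % rate)
```

Under these assumptions there are a bit over 28,000 local ports, so a script that opens connections faster than a few hundred per second runs out after a few tens of thousands of requests, which is the same order of magnitude as the roughly 20k documents mentioned above.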

Three suggestions to fix this:

  1. Pause requests every once in a while. Maybe put a sleep every X requests, giving connections in TIME_WAIT time to expire and close.
  2. Send the Connection: close header: an option for the sender to signal that the connection will be closed after completion of the response.
  3. Reuse connection(s).

I ended up going with option 3) and rewrote my colleague's script to reuse the same TCP connection.
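For illustration, a minimal sketch of what connection reuse could look like with requests.Session, which keeps the underlying connection alive between calls. This is not the actual rewritten script: the FIELDS tuple and the index_packages and main helpers are hypothetical names, and the column order is assumed to match the loop in the question.

```python
# Sketch only: FIELDS, index_packages and main are hypothetical names.
FIELDS = (
    "tracking_number",
    "order_number",
    "payment_info",
    "shipment_provider_name",
    "package_status_name",
)


def index_packages(session, rows, base_url="http://localhost:9200"):
    """PUT every row through one shared session and return how many were sent.

    Each row is assumed to be (id_package, *values) with the values in the
    same order as FIELDS, as in the loop from the question.
    """
    sent = 0
    for row in rows:
        id_package, values = row[0], row[1:]
        doc = dict(zip(FIELDS, values))
        url = "%s/packages/package/%s/_create" % (base_url, id_package)
        # The session keeps the connection open (HTTP keep-alive), so this
        # does not open a new TCP connection for every document.
        session.put(url, json=doc)
        sent += 1
    return sent


def main(cursor):
    # requests is third-party; it is imported here so that index_packages
    # stays usable (and testable) with any object that has a put() method.
    import requests

    with requests.Session() as session:
        return index_packages(session, cursor.fetchall())
```

Here main(c) would be called with the same MySQL cursor c used in the question. With a single session, netstat should show one established connection to port 9200 rather than a growing pile of TIME_WAIT entries.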


Docker container http requests limit

