Run a Linux process on a remote server and disconnect terminal session
I've been writing some nifty python scripts lately as we (WalkJogRun) migrate our route data from XML files to Google Maps encoded polylines. I finally whipped the script into shape and started pulling our route data from Amazon S3, encoding as SQL update statements and then rinse and repeat for nearly 500,000 legacy XML files. Trouble is, 500,000 files on my local machine would take about 20 hours.
Amazon AWS to the rescue! I span up a new Amazon EC2 medium size instance (high CPU) and pushed my script up to the cloud. The beauty of this is that the bandwidth between EC2 and S3 is free whereas my local machine relied on a crappy network connection and incurs the bandwidth charge on our S3 account. Ordinarily S3 bandwidth is pretty cheap but the script running on our EC2 machine is pulling down 5gb of data per minute and processing 1,000 files per minute. In around 8 hours we should have every legacy file processed and converted to a SQL statement we can run against our MySQL database.
But wait. My terminal timed out. Doh. If I login to the server over terminal and start the python script it is connected to my shell so if the connection breaks (on my crappy internet connection) the process is terminated on the server. The batch isn't as smart as it could be so it doesn't gracefully restart where it left off so I called on my friendly linux whiz Scott Frazer for a solution.
Scott directed me to the NOHUP command and after a few minutes of reading, a little typing and restarting my first batch I was able to disconnect my session thanks to this helpful command:
If you run this command, exit and log back in you'll see the PTS/1 replaced by ? when you run the ps -fe command. I'm able to track the progress of my script because each iteration/file writes a line to the sql file it generates and the handy wc -l command tells me how many lines are in that file.
So thanks to Scott I'll have all my new shiny encoded polylines for our database and we can start to eliminate the need for the legacy XML files. The background there is that Google Maps on the iPhone and in the browser both support polyline encoding such that instead of a large xml file of latitude and longitude coordinates (some as big as 40kb!) I get an encoded string I can insert into the database each no more than 1000 characters.