What to do if your Amazon EC2 instance becomes unresponsive?

I have been hit very hard by this. Our programmers were actively working on a PHP application when, all of a sudden yesterday, my Amazon EC2 instance stopped responding to all TCP/UDP communication. I tried everything, including ec2-reboot, but nothing worked. There were a few major code changes that had not been backed up, and I fear losing them.

The distribution was CentOS 5, and I was running Apache and HAProxy on the instance; HAProxy was there so that a new instance could be launched to accommodate the load. Thankfully, the MySQL database was on a different instance. The instance was launched using RightScale (because my client wanted it). The RightScale team had made some code changes to their Dashboard, but I doubt that would affect a running instance.

I am not going to terminate the instance while there is still a last ray of hope. I am posting this question to several major forums, including Slashdot, in the hope that I might get some help. So, if any of you have ever faced this situation, how did you deal with it? How did you recover your data from the EC2 instance? And how did you get your instance to respond to TCP communication again?

UPDATE: It was a hardware fault, and after some time the instance came back to normal without losing anything. 🙂 Thanks to everyone who helped me during that time.



5 Responses to “What to do if your Amazon EC2 instance becomes unresponsive?”

  1. Edward M. Goldberg Says:

    Things that I try first are:

    Launch a second server in the same zone and group, and try to ping and log in from the “inside” of EC2. If for some reason the external network interface was updated (for example, an EIP was re-assigned), the server may only be reachable at its known “10.” address, that is, its internal address.
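The "from the inside" check above can be scripted from the second server. The following is a minimal sketch, not part of the original comment: it tests whether TCP ports accept connections instead of sending ICMP pings, and the function names `tcp_port_open` and `probe`, the default ports 22 and 80, and the 3-second timeout are all assumptions chosen for illustration.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def probe(addresses, ports=(22, 80), timeout=3.0):
    """Map every (address, port) pair to True or False reachability."""
    return {
        (addr, port): tcp_port_open(addr, port, timeout)
        for addr in addresses
        for port in ports
    }
```

Running something like `probe(["10.0.0.12", "203.0.113.7"])` (both addresses are hypothetical placeholders for the instance's internal and public addresses) from the second server would show whether SSH or HTTP answers on either address. If nothing answers on the internal address either, the problem is probably not limited to external routing.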

    Next, try attaching an EIP to that instance. This (after about 3 minutes) resets all of the network routes for the instance and may clear up the route tables at AWS. It is worth a try; an EIP for a day or two does not cost much.
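For readers who prefer to script the EIP attach rather than use a console, here is a rough sketch. It assumes a boto3-style EC2 client (boto3 postdates this post and is not mentioned in it) whose `associate_address` call accepts `InstanceId` and `AllocationId`; the helper name `attach_eip` and the IDs used with it are hypothetical.

```python
def attach_eip(ec2_client, instance_id, allocation_id):
    """Associate an already allocated Elastic IP with an instance.

    ec2_client is assumed to behave like boto3.client("ec2").
    Returns the association ID reported by the API, if any.
    """
    response = ec2_client.associate_address(
        InstanceId=instance_id,
        AllocationId=allocation_id,
    )
    return response.get("AssociationId")
```

With boto3 installed and credentials configured, `attach_eip(boto3.client("ec2"), "i-0123456789abcdef0", "eipalloc-0123456789abcdef0")` would be one way to call it; both IDs are placeholders. After the association, allow a few minutes before retrying the connection, as the comment suggests.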

    If all else fails, learn from this mistake. Next time, mount an EBS volume at the file system level for all of the code you are working on, and keep a good SNAPSHOT of all of your work.

    You can create a 1 GB volume and mount it under /var/www/htdoc, for example, and have an “off instance” backup of every change you make with just the click of a mouse! I use Elasticfox for this trick.
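The snapshot step can also be automated instead of clicked. The sketch below is an illustration under assumptions, not the commenter's method: it expects a boto3-style EC2 client exposing `create_snapshot(VolumeId=..., Description=...)`, and the helper name `snapshot_code_volume` and the `label` parameter are made up for this example.

```python
from datetime import datetime, timezone


def snapshot_code_volume(ec2_client, volume_id, label="code backup"):
    """Request a snapshot of an EBS volume and return the snapshot ID.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    # Timestamp the description so successive snapshots are easy to tell apart.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=f"{label} {stamp}",
    )
    return response["SnapshotId"]
```

Called on a schedule (for example from cron) with the ID of the volume mounted under the code directory, this would keep an off-instance copy that survives an unresponsive instance. A snapshot only contains data that has already been written to the volume, so it is worth flushing pending writes before taking one.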

    I hope this helps; please send me a note if I can help more.

    I have my head in the Clouds, and love to spread the word.

    Edward M. Goldberg

  2. Hameedullah Khan Says:


    Thanks for posting the possible solutions. Unfortunately, I had already tried all of these options and was afraid they might not be enough. I had 2 other instances running in the same security group (in RightScale terms, the 3 instances were part of the same Deployment). So I tried to ping the 10. address from the other instances, and also tried an EIP, but nothing worked.

    Thanks for the SNAPSHOT suggestion. I had taken backups, but as I said, there were some major code changes by the programmers, and I was on holiday when this happened, so those major changes are lost.

    Once again thanks.

  3. Eugene Says:

    I was looking for some ideas and stumbled upon your post 🙂 so I decided to say thanks. Eugene

  4. Hameedullah Khan Says:

    I am glad the post was of help to you.

