Website Inaccessibility July 27, 2017

  • 1
  • Problem
  • Updated 2 years ago

The website appears to be inaccessible at the moment (7:26am EST). I have a PC (Windows 10) and have been using Chrome as a browser. I had the Eterna site open on several pages (one tab displaying puzzles and several tabs with puzzles from that page which I was in the process of solving). After solving one puzzle and closing that tab, I noticed that the puzzle on the next tab over appeared frozen (though I didn’t use that tool). When I then tried to reload this tab I was presented with the following message:

https://prnt.sc/g0zy21

Clicking the “OK” button removed the little window, but otherwise no change to the page was observed. As errors are sometimes only temporary, I decided to refresh the page. The page that then loaded displayed a different message, as seen below.

https://prnt.sc/g0zyeu

  Also, this appears to affect the whole site including the link to the forum which was unable to direct me here.  

    Perhaps the site is just under necessary maintenance at the moment, but I wished to share what was seen on my side in case it proved important. 

cynwulf28

  • 80 Posts
  • 22 Reply Likes
  • Not sure how to feel. Not too worried yet :-)

Posted 2 years ago

cynwulf28

  • 80 Posts
  • 22 Reply Likes
Whoops, was looking at the date when I gave the time. My apologies for any confusion. The correct time (EST) is displayed in the screenshots I posted. 
    Please Note: The site appears to be operational again as of 10:13 AM of this same day. 

LFP6, Player Developer

  • 600 Posts
  • 108 Reply Likes
Thanks for the report. It seems like we've been having more of this kind of thing pop up as of late. If the dev team has any insight into what's going on I'd be interested to hear about it, otherwise I think this is another instance of needing to rebuild our platform sooner rather than later.

rhiju, Researcher

  • 403 Posts
  • 122 Reply Likes
We also got an alarm from Amazon Web Services, which hosts the Eterna servers ('UnHealthyHostCount'). We'll discuss it at an upcoming developers' meeting.

rhiju, Researcher

  • 403 Posts
  • 122 Reply Likes
Devs have been looking more closely at server logs and didn't notice anything funny regarding an unexpected number of queries, or failures by the Eterna daemons or the servers, which are hosted by Amazon Web Services. Our current suspicion is that the Amazon gateway itself had a transient problem, which also happened earlier this summer. Hopefully it won't happen again -- but if it does, it's out of our control!

cynwulf28

  • 80 Posts
  • 22 Reply Likes

 Hello again. I have decided to post similar problems here rather than create a new thread to make it easier to track such data.

     I have been experiencing what “appear” to be flash-based malfunctions while using Eterna for a week or two now. [please note that I have also experienced Chat crashes for about a month with regular frequency, usually about 3 times a day and lasting from 5 to 50 minutes at a time. I mention this here in case the issues are somehow related].

      The types of problems I have run into over the last month include what I have already commented on, but within the last week I have experienced a new (for myself) type of freeze/crash of the site and its features. Here I present screenshots from a single instance on July 30th. Times are included in the screenshots and are in Eastern Time. Note that I took as many screenshots as I did, not for the sake of determining the problem alone, but also to show the effect this had on the site’s various features.

   The first 6 screenshots are from the same page (a player-made puzzle I was attempting to solve). The text of the puzzle’s name, the counts for GC and GU pairs, the Total energy display, and the link to the home page fail to display properly:

https://prntscr.com/g4b8u4

 

The in-game settings menu also failed to display text for the listed options:

https://prntscr.com/g4b8wl

 

When I went to “reset” the puzzle, the usual pop-up message displayed as seen in the next shot:

https://prntscr.com/g4b8yl

 The responsiveness of the fields to be clicked seemed to be unaffected despite the lack of text, and I was able to select the “yes” or “no” option which then appeared in order to reset the bases:

https://prntscr.com/g4b95z

  I then pasted the sequence I had back into the puzzle, the window for which is seen below again without text:

https://prntscr.com/g4b98i

  As only the text seemed to be affected by the error, rather than the icons or the page overall, it seemed that the problem had to do with the transitive aspects of the page’s display.

     However, when I went to formally reload/refresh the page I was faced with the following message about “Shockwave Flash” not being able to load:

https://prntscr.com/g4b9d9

    I also had another tab open with a puzzle which I had “beamed to puzzlemaker”. That page had frozen, and when I reloaded/refreshed it I got a different error stating not only that “Shockwave Flash” could not be loaded, but that it had “crashed”:

https://prntscr.com/g4b9ex

 

   The “Player Puzzle” page seemed to be less severely affected, though the Chat was disabled through a similar error (Correct me if I am wrong, but **I thought we changed Chat from Flash-based to HTML-based**)

https://prntscr.com/g4b9hv

 

       I have encountered this problem at least twice, including once around 1 AM (ET) today, August 4th 2017, which also happened to precede an outage in Comcast's Internet service. That outage might be related, but I don't have enough data to calculate p values.

rhiju, Researcher

  • 403 Posts
  • 122 Reply Likes
hi -- we're following this issue. There was another Amazon Web Services problem yesterday, with the site down for ~30 minutes. it would be useful to hear if other players saw the same problem. Thanks, for the devs.

whbob

  • 190 Posts
  • 57 Reply Likes
I was not online on 3 Aug. 2017.
Masterstormer had asked if the site was down at 2AM PST on the 28th & 30th of July.  
2 Aug. 2017 at about 4:45 PST the site went down for me for about 15 minutes.  My time zone is EST.
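
Since the reports in this thread mix PST and EST, putting them on one clock makes them easier to compare. The following is only a sketch: it takes the "PST" and "EST" labels literally as fixed UTC-8 and UTC-5 offsets, which is an assumption (if daylight time was meant, the UTC value shifts by an hour, though the 3-hour gap between the two zones stays the same). The helper name `to_utc` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, taking the zone labels used in the thread literally (assumption).
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")


def to_utc(year, month, day, hour, minute, zone):
    """Return the given local wall-clock time as a timezone-aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=zone)
    return local.astimezone(timezone.utc)


# The "2AM PST on the 28th of July" report mentioned above.
report = to_utc(2017, 7, 28, 2, 0, PST)
print(report.isoformat())                                     # 2017-07-28T10:00:00+00:00
print(report.astimezone(EST).strftime("%Y-%m-%d %H:%M %Z"))   # 2017-07-28 05:00 EST
```

With every report converted this way, two sightings of the same outage should land within minutes of each other regardless of which zone each player quoted.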

When I have experienced the problem, I have used both the secure and non secure urls and both were down. Don't know if they are from the same place or not.

Would a mirror site help define the problem (to see if it is software or hardware)?
   
Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes
@cynwulf28 FWIW, I don't think Flash crashes and general site unavailability stem from the same cause.  

Before I bit the bullet and upgraded my memory at the beginning of the year, Flash crashes were quite common for me.  Monitoring memory made it clear that the Chrome/Flash combination will continue to increase memory consumption until Flash crashes.  How badly that affects you will depend on things like how much available memory you have, how many Eterna tabs you keep open at one time and how often you restart your machine.  You might try monitoring your memory and see whether this has any relevance to your issues.
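
For anyone who wants to put numbers on this, here is a minimal sketch of one way to check whether memory use is climbing steadily. It assumes you jot down a few (minutes elapsed, MB used) readings by hand from whatever memory monitor you use; the function names and the sample readings are hypothetical, and a straight-line fit is only a rough approximation of how usage actually grows.

```python
from statistics import mean


def growth_per_minute(samples):
    """Least-squares slope of memory use (MB) against time (minutes).

    `samples` is a list of (minutes_since_start, used_mb) pairs.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    times = [t for t, _ in samples]
    used = [m for _, m in samples]
    t_bar, m_bar = mean(times), mean(used)
    denom = sum((t - t_bar) ** 2 for t in times)
    if denom == 0:
        raise ValueError("samples must span more than one point in time")
    return sum((t - t_bar) * (m - m_bar) for t, m in samples) / denom


def minutes_until_full(samples, total_mb):
    """Rough time until `total_mb` is reached; None if usage is not growing."""
    slope = growth_per_minute(samples)
    if slope <= 0:
        return None
    latest_used = samples[-1][1]
    return max(0.0, (total_mb - latest_used) / slope)


# Hypothetical readings, typed in by hand from a memory monitor.
readings = [(0, 4000), (10, 4500), (20, 5000), (30, 5500)]
print(growth_per_minute(readings))           # 50.0 MB per minute
print(minutes_until_full(readings, 16000))   # 210.0 minutes
```

A slope that stays positive across a whole session, rather than levelling off, would be consistent with the steady growth described above.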

cynwulf28

  • 80 Posts
  • 22 Reply Likes
Thank you for the replies. My PC specs are as follows: Edition: Windows 10 Home. Version: 1703. Processor: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz 3.20 GHz. Installed RAM: 16.0 GB. System type: 64-bit operating system, x64-based processor.

     My local storage is currently at 177 GB out of 931 GB (753 GB remaining). 

I admit that I often have multiple tabs/windows open (out of necessity). I'll have the "Home" page, and/or "Player Puzzles" page open so that I have access to both puzzles and the menus that link to resources such as Past Labs, the Wiki, or Forum. I then will also have open the page for whichever "Puzzle" and/or "Lab" I am working on at the time and may also have "Puzzle Maker" open to aid in dissecting harder to solve features.  
    If there is another puzzle (which I already solved) with shared characteristics then I may open that puzzle in yet another tab/window to compare structures and sequence. Alternatively if one of the many instructional puzzles [currently compiling a registry of these by the way] contains relevant information I might have that open. 
   From time to time newer players will request aid on a puzzle (almost always, but not exclusively, from the "New[er] Progression"); then I may briefly open up a new tab with the puzzle so that I can lend them advice on how to go about solving the puzzle and/or troubleshoot (simple) technical problems for them. Occasionally the best advice I can give is to share a relevant link {Shout out to Linkbot!}, which sometimes requires me to open yet another tab.
     I was told once that anything beyond 5 open tabs for the site could potentially cause (chat-related) issues, but hadn't considered other potential risks.  

      Thank you for the suggestion, I'll try to economize my pages going forward.   

machinelves

  • 155 Posts
  • 23 Reply Likes
Hey I am just chiming in to acknowledge that Omei's advice about ram resonates with my experience, since my macbook pro has 16 GB ram like your own machine, and overheats much worse and more quickly than my pc with 32 GB ram. There may be multiple factors, I am guessing, possibly like mac OS window handling being terrible, flash not being well suited to this much data, the code itself having memory sinks or mishandlings, and finally our computers not having enough ram. I have to defer to Omei or anyone who can speak to the specific causes of slowness or crashing, and there are often different factors at play.

So I don't know what the exact sources of the problems are, but having more ram can help, though may not solve the issues.

For myself, to ensure I do not melt down my machine, I only use eterna on the 32 GB pc for now, and do not generally get above 75 C temps or have any crashing. If you can upgrade your machine with another stick of 16 GB ram, it's one of the easiest things to just pop in, and you could probably find someone to help if needed. If you like having lots of tabs and apps open, it's a huge help.

That thing about not opening more than 5 tabs in eterna just means chat will stop loading new replies and may not register input on the 6th+ tab[s], but I have opened 20-40 tabs of puzzles and complete/submit no problem.

So this is why I want to echo Omei's comment on ram and resource management - those of us using lots of open apps, windows, or tabs may need more ram in general. Where is my supercomputer?! Watsoooooon!!! 

If you haven't already, it is fun to try out that Resource Monitor app Omei mentioned, it's cool to see what your computer is crunching on.

Good luck!

p.s. thanks for outlining the specifics of any issues you encounter, it is a huge help!!
(Edited)

Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes
@cynwulf, your symptoms align well with the ones I had before increasing my memory.

The "local storage" number you quoted is your disk storage; it's the RAM that's relevant here.  It's been a while since I have used Windows regularly, but I would guess the easiest way to monitor your RAM usage is with the Resource Monitor.  (See, for example, http://www.techrepublic.com/article/how-to-use-windows-10s-resource-monitor-to-track-memory-usage/).  

You might try rebooting your machine, bringing up Eterna in one tab and checking your RAM usage. (The Processes tab is probably the most useful because it not only has a Used Memory meter, but you can also monitor the Hard Faults, which generally go really high when the machine doesn't have enough free memory to smoothly handle what you're doing.)   Then continue with your normal usage of Eterna while keeping an eye on how it changes when you start having issues with Eterna.

Omei Turnbull, Player Developer

  • 966 Posts
  • 304 Reply Likes
An update on this last unavailability - examining the Apache (Web server) logs, I can see that the server continued to operate throughout the outage.  I don't recall whether I posted it here, but this is consistent with what the Amazon console was saying during the outage -- that the Eterna server instance was fine, but the load balancer that sits between the server instance and the Internet couldn't communicate with it.  I also noticed that Amazon had posted they had been having EC2 outages earlier in the day at the same service center (North Virginia) that hosts us.

Although this doesn't constitute proof that the outage was beyond Eterna's control, that is where the preponderance of evidence points.
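
For the curious, the kind of log check described above can be sketched roughly as follows. This is not the script actually used; it assumes the access log uses Apache's default bracketed timestamp format, and the sample lines, the outage window and the function name `requests_in_window` are all hypothetical.

```python
import re
from datetime import datetime

# Timestamp field of an Apache access-log line, e.g. [03/Aug/2017:16:50:31 -0400]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")
STAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def requests_in_window(log_lines, start, end):
    """Count log entries whose timestamp falls within [start, end]."""
    count = 0
    for line in log_lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip lines without a recognizable timestamp
        stamp = datetime.strptime(match.group(1), STAMP_FORMAT)
        if start <= stamp <= end:
            count += 1
    return count


# Hypothetical log lines and outage window, for illustration only.
sample_log = [
    '192.0.2.10 - - [03/Aug/2017:16:40:02 -0400] "GET / HTTP/1.1" 200 5120',
    '192.0.2.10 - - [03/Aug/2017:16:50:31 -0400] "GET / HTTP/1.1" 200 5120',
    '192.0.2.11 - - [03/Aug/2017:17:20:45 -0400] "GET / HTTP/1.1" 200 5120',
]
window_start = datetime.strptime("03/Aug/2017:16:45:00 -0400", STAMP_FORMAT)
window_end = datetime.strptime("03/Aug/2017:17:15:00 -0400", STAMP_FORMAT)

print(requests_in_window(sample_log, window_start, window_end))  # 1
```

A nonzero count only shows that Apache was still handling some requests during the window; it says nothing about where they came from or why outside traffic could not get through.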