Migration from vm3 to vm6


deeperbydesign
04-30-2005, 07:59 PM
Hello, I have a request regarding the upcoming migration from the ailing vm3 host to your new vm6 machine. Can you please post some notice before the migration starts and keep those of us affected notified of the status of things?

I think that a good bit of the frustration that people have had with this service lately is not because unixshell is doing a bad job but rather because we don't know what you are doing. That leads people to assume the worst. I can't speak for others, but that has certainly been the case for me; maybe others feel the same way.

Cheers,
Brian

cmtech
04-30-2005, 09:06 PM
Yes, I feel the same, and I have been in private communication with many other users. This is the first I've heard that vm3 is ailing, although I already knew it and have been sending support tickets requesting a transfer.

Brian is absolutely right, we can all understand teething problems and be sympathetic but leaving us in the dark just isn't on.

Edit: I have had no reply from unixshell regarding my completely broken VPS on VM3 for 36 hours now.

Edit2: Brian, you have elected not to receive private e-mails. It would be good if we could communicate privately. My e-mail address is robert.pitt@gmail.com, and if you send me a private e-mail maybe we can help each other.

matta
04-30-2005, 09:24 PM
The ETA was stated at http://www.unixshell.com/forum/showthread.php?t=212&page=3. It should come online tonight or tomorrow; we then need to migrate the VMs.

deeperbydesign
04-30-2005, 09:29 PM
I had seen that thread and am glad things will be fixed so quickly.

I think that my request might not have been clear. I am interested in knowing when you are actually ready with the new server so that I can expect my VM to be down during the migration. Hopefully it can be done quickly and outside of business hours? Please let us know before our VMs just go away.

Status during the migration and any help with headaches due to it would be great too. I assume we will not be changing IPs?

Thanks in advance!

cmtech
04-30-2005, 09:43 PM
You could have told us in an e-mail. We cannot be expected to read through random threads on your forum for reasons why our service is broken. I still have received no reply to my support ticket although I am glad that this at least has drawn some reaction.

Come on guys, I was a happy Tektonic customer for a whole year. All we want is to be kept informed as you can see from Brian's positive reaction right here.

matta
04-30-2005, 10:15 PM
The migrations will be performed at night (EST) and will be posted to the Outages forum. Each migration should mean only a few seconds of downtime per VM, and nothing will change as far as IPs or login information.

cmtech
04-30-2005, 10:19 PM
Thanks. My VPS is completely dead and cannot be revived using your Teknic control panel, even by installing a new OS. I am confident it will come back online as soon as you have performed the migration. It is a shame I've been kept in the dark; I am a reasonable guy. When the service is rock solid and stable like what I get from your Tektonic service, I will be voicing my approval.

Edit: Perhaps a handy list of service status links would be a good idea?

matta
04-30-2005, 10:23 PM
I show your VM as up... I connected to the console and got a login prompt. VMs on VM3 function just fine... it's just that the host has been crashing about every 2 days, but when it is up it is fine. As shown on our support page, it's our responsibility to ensure the host stays up and to support any problems in Teknic, but if you have a problem inside your VM it is up to you to fix.

deeperbydesign
04-30-2005, 10:36 PM
Sounds great, looking forward to some extended uptime. I'll watch for the status info in the Outages forum then, thank you.

cmtech
04-30-2005, 10:42 PM
ssh -l citymind echo -p 7322
citymind@echo's password:
Last login: Sat Apr 30 01:15:22 2005 from cpc1-brig4-5-0-cust169.brig.cable.ntl.com
************ REMOTE CONSOLE: CTRL-] TO QUIT ********

************ REMOTE CONSOLE EXITED *****************

Yes, and now my normal services within the VPS work too. This was not the case four hours ago. I do not know if you changed something or if VM3 magically corrected itself. Please do not make out that it was my fault that your system screwed up and my VPS was offline for 30+ hours. It does not magically fix itself. If it does, the magic problem is not my issue. It is not my responsibility to fix your broken VM3. Please do not make it out as such.

jkf
05-01-2005, 04:56 AM
I've run into issues where my VM won't reboot without me hitting Teknic. I can attach to the console, issue a reboot in the VM, watch Linux shut down and then just hang until I (re)load my Teknic page. Then the VM will start booting. I've had several downtimes that appear to be multiple hours in duration where the VM was down and I couldn't attach to the console, but as soon as I hit Teknic, I could attach to the console and it was booting. I don't know if it is a quirk of the configuration on VM3, but it started after the problems a while ago, when we were forced to use a FC kernel. I assume that was when VM3 was upgraded to FC4, so maybe that has something to do with it. All I know is I'm soooo looking forward to the migration to VM6 and what I hope will be stable service. :)

matta
05-01-2005, 08:06 AM
It should all work out well; all the other servers are running very well these days. I'll update if there are delays with our supplier/datacenter. I expect to have the server by Sunday and have it set up and ready for migrations by Monday.

cmtech
05-01-2005, 11:25 AM
Good. Incidentally, I have submitted 7 support tickets (vs. 3 in my whole time on Tektonic, 1 of which was to buy more resources). 1 was a duplicate. 1 was my fault for not properly reading the documentation on usage of the SSH forwarded console. 1 is half questionable: switching single user mode on the control panel does not instantly switch runlevels, as you would expect based on the behaviour of init, and this fact is not documented. It is, however, in the message that comes up later, and I claimed responsibility for that half of the ticket. Both of these happened in the very first week of using the unixshell service, while I was learning about it, and are related to their specific system rather than anything within my VM. The other half of that ticket and 3 of the others appear to be exactly the same issue you describe here (http://www.unixshell.com/forum/showthread.php?p=979) in post 979. That is, my system would respond to ICMP echo but no connections could be formed to my VPS, the SSH console or the Teknic control panel. This happened both from my cable modem and from my Tektonic (not unixshell) VPS. Obviously the problem is not an external network issue.
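For anyone wanting to reproduce the "answers ping but accepts no connections" symptom described above, here is a minimal sketch of a TCP-level probe in Python. It is only an illustration: the function names are made up for this example, and the host and ports you pass in are your own. It checks whether a full TCP connection can be established, which is a stricter test than an ICMP echo reply.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) completes in time.

    A host can answer ICMP echo while every TCP service on it hangs or
    refuses, so this is a separate check from ping.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe(host: str, ports, timeout: float = 3.0) -> dict:
    """Probe several ports on one host and map each port to True/False."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling something like `probe("my-vps.example.org", [22, 80])` (hostname hypothetical) from two different networks would show whether the failure follows the client or the server.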

Of my remaining support tickets, unixshell replied to one of them, saying it was a blocked port and not my fault, and has ignored the other two and a half (the half being that they answered the half of my ticket that was my fault and ignored the other half). Those apparently corrected themselves, I surmise due to VM3 trouble that they repaired and did not see fit to tell me about.

The latest ticket, which has again been ignored for over 48 hours now, was a different problem that I will now describe in great detail and verbatim here so there is no confusion. I changed the kernel in the Teknic control panel. After doing so, pressing the start button would not bring my VPS back up. No matter which kernel I used, the result was the same. I replaced the system with a fresh OS install, which we both know shouldn't in any way affect the operation of Teknic, just so unixshell couldn't say "it's your VM problem", which they did anyway. Obviously this had no effect and the problem persisted, so I restored my VM image. Connecting to the console dropped me through to "invalid domain", as it does when the system is offline. Just now I logged into it for the first time since it reappeared. It appears to have been up constantly since then. Out of curiosity, I shut it down again and changed the kernel twice, once to 2.6-latest and then back to 2.4-latest, which is what I had switched it to before it broke last time. Exactly the same problem has occurred: it will not come back online.

After trying every sequence of clicks I can think of in Teknic, it is still down. Incidentally, I noticed this small piece of incorrect information while flicking through all the options:

Single user mode for citymind set to off. After you reboot your server you will need to access it through the SSH console.

I do not need to access it via the SSH console because I turned single user off, not on. Anyway, let's start her up...

Starting citymind .. please wait
Starting citymind ..
citymind started.

Operation sucessful: Return to main

Great so my VPS is up now right?

Last login: Sun May 1 05:41:39 2005 from cpc1-brig4-5-0-cust169.brig.cable.ntl.com
Error: invalid domain:citymind
Connection to echo closed.

Nope. So this is a problem with my VM and I better restore one of these blank images. Having a preference for slackware, I'll go with that.

Server is currently performing a reinstall. When complete the Status on the servers index will change from Installing to Down. You can then start the server.
Staus: Installing

A cup of tea, a cigarette and a Sunday morning row between the neighbours later, the VPS is done installing and the status is now:

Status: Down

OK. Clean slate, let's fire her up...

Starting citymind .. please wait
Starting citymind ..
citymind started.

Operation sucessful: Return to main

But...

Status: Down

And...

Last login: Sun May 1 05:44:28 2005 from cpc1-brig4-5-0-cust169.brig.cable.ntl.com
Error: invalid domain:citymind
Connection to echo closed.

So guys, truce? What am I doing wrong here? You tell me. I'm restoring my VPS snapshot now and will await your reply.

Edit: Scrolling back up over my earlier checks, I noticed that it had kernel 2.6.11-prep, instead of that Fedora kernel, before I changed it again. What was that, a kernel in preparation for the migration? I do not see it in the list of kernel options you can choose from in Teknic...

matta
05-02-2005, 06:30 PM
The problem seems to be a conflict between the Xen included with FC4 and the Linux 2.4 kernel. When a VM is started with the 2.4 kernel it does not start, but instead starts a zombie VM. It seems you've been switching your kernels around, and that is why you were experiencing the problem when no one else on the server was. Until migrations are complete please just use the 2.6 kernel; in benchmarks the 2.6 kernel is always at least 40% faster.

I believe in Xen 3.0 the Linux 2.4 kernel will not officially be supported anymore.

matta
05-03-2005, 04:20 PM
The server has not been set up yet; again, there seems to be a shortage at our supplier of the motherboard we use. For now VM3 seems to have stabilized a bit (over 4 days uptime currently) and we'll continue to keep a close eye on it. Normally hardware such as this is shipped quickly, but I do not have a set ETA.

matta
05-05-2005, 07:54 PM
As an update, we're still waiting on the motherboard, apparently. I'm working (read: yelling at people) to get this set up as soon as possible.

matta
05-06-2005, 04:13 PM
I just heard back from our vendor on the delay. We use the Tyan S2881 motherboard (look it up if you want... very nice board, we're not using cheap hardware here) and apparently there is a huge backorder on these boards. I'm looking into quick alternatives at this point. I really want to get VM3 fixed, as I'm not too fond of receiving pages regarding it at all hours of the day. I can get a Dual Opteron board today (the same one used for VM1, actually) and I will see if we can start migrations tonight.

deeperbydesign
05-06-2005, 04:36 PM
That sounds great, thanks for keeping us up to date on the situation. I wanted to get a clarification on the choice of hardware for vm6 though. It IS hardware you have used before and found to be stable/compatible for your service (vm1 has been stable)? Do you plan to move us to vm6 and then allocate new customers to vm3, or are you planning to move us back to vm3 after re-installing its host OS?

Thanks and good luck with the procurement/migration!

matta
05-06-2005, 05:21 PM
vm1 has been up for 27 days as of now. It is a good motherboard; it's just that the Tyan we use has PCI-X, 8 RAM slots, and dual on-board Tigon3 NICs. The alternative motherboard being used is still good, just without those features.

matta
05-06-2005, 05:22 PM
Didn't notice the second part, VM3 customers will be moved to VM6. VM3 will be rebuilt and new customers will be placed on it.

matta
05-07-2005, 03:58 AM
VM6 is now online and being configured. I will be migrating some clients tonight and the rest tomorrow morning.

griffinn
05-07-2005, 05:01 AM
Before migrating each of us over, please do an orderly shutdown of our hosts. It's always unnerving to see dmesg logs like this and be left wondering which inodes have been deleted:

EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: sda1: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 587575
ext3_orphan_cleanup: deleting unreferenced inode 1029115
ext3_orphan_cleanup: deleting unreferenced inode 97926
ext3_orphan_cleanup: deleting unreferenced inode 97925
ext3_orphan_cleanup: deleting unreferenced inode 97924
EXT3-fs: sda1: 5 orphan inodes deleted
EXT3-fs: recovery complete.
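As a rough illustration of how one might keep an eye on this after an unclean shutdown, the sketch below pulls the orphan inode numbers out of dmesg-style text and compares the count with the kernel's own summary line. The function name and regular expressions are assumptions written against the log excerpt above, not a general ext3 log parser. Note that orphan inodes are usually files that were already unlinked but still held open when the system went down, so the list does not necessarily mean live data was lost.

```python
import re

# Patterns written against the excerpt quoted above (assumed format).
ORPHAN_RE = re.compile(r"ext3_orphan_cleanup: deleting unreferenced inode (\d+)")
SUMMARY_RE = re.compile(r"EXT3-fs: (\S+): (\d+) orphan inodes? deleted")

SAMPLE = """\
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: sda1: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 587575
ext3_orphan_cleanup: deleting unreferenced inode 1029115
ext3_orphan_cleanup: deleting unreferenced inode 97926
ext3_orphan_cleanup: deleting unreferenced inode 97925
ext3_orphan_cleanup: deleting unreferenced inode 97924
EXT3-fs: sda1: 5 orphan inodes deleted
EXT3-fs: recovery complete.
"""


def summarize_orphans(log_text: str) -> dict:
    """Collect deleted orphan inode numbers and the kernel's reported total."""
    inodes = [int(m.group(1)) for m in ORPHAN_RE.finditer(log_text)]
    summary = SUMMARY_RE.search(log_text)
    return {
        "device": summary.group(1) if summary else None,
        "reported": int(summary.group(2)) if summary else None,
        "inodes": inodes,
    }


if __name__ == "__main__":
    print(summarize_orphans(SAMPLE))
```

In practice the input would be the output of `dmesg` captured right after boot rather than the embedded sample.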

matta
05-07-2005, 05:22 AM
Yes, it will be a clean shutdown and I'm aiming for only a few minutes of downtime per VM.

zeroion
05-07-2005, 07:08 AM
Is there a way we can volunteer our accounts to be transferred over first?

matta
05-09-2005, 04:58 AM
All the migrations have been completed. It ran with minimal problems and downtime.

zeroion
05-09-2005, 07:42 AM
Not to complain, but I noticed that all filesystem ACLs were lost when you performed the migrations. You might want to use star instead of tar to back up the filesystems in the future.

matta
05-09-2005, 05:57 PM
It wasn't tar... it was rsync with the -aS options. We'll have to see if future versions of rsync will support ACLs.
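One possible workaround when the copy tool does not carry ACLs, sketched here in Python purely as an assumption (it is not what was done in this migration): dump the ACLs with `getfacl -R` before the copy and reapply them with `setfacl --restore` afterwards. The helper names and paths are hypothetical, and the sketch assumes the standard Linux `acl` utilities are installed on both hosts.

```python
import subprocess
from pathlib import Path


def dump_acl_command() -> list:
    """argv for dumping ACLs recursively, with paths relative to the cwd."""
    return ["getfacl", "-R", "."]


def restore_acl_command(dump_file: str) -> list:
    """argv for reapplying ACLs from a getfacl dump file."""
    return ["setfacl", f"--restore={dump_file}"]


def dump_acls(root: str, dump_file: str) -> None:
    """Run on the source side before copying: write all ACLs under root."""
    with open(dump_file, "w") as out:
        subprocess.run(dump_acl_command(), cwd=root, stdout=out, check=True)


def restore_acls(root: str, dump_file: str) -> None:
    """Run on the destination after copying: reapply the dumped ACLs.

    The dump uses relative paths, so the command runs with cwd=root, and
    the dump file path is made absolute before the directory change.
    """
    absolute_dump = str(Path(dump_file).resolve())
    subprocess.run(restore_acl_command(absolute_dump), cwd=root, check=True)
```

The dump file is plain text, so it can travel alongside the data in the same rsync run.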

cmtech
05-10-2005, 12:48 PM
So far since migration my VPS seems stable with over 3 days (disregarding a reboot by me) of uptime when before I couldn't achieve 2. Looking good, hope it continues, and thank you for being more transparent with what was really happening.