TBEG project unavailable
Author | Message |
---|---|
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
Dear Crunchers and Natalia! Basically, the database was corrupted again. Do not panic! I have a very recent backup (from a few minutes before the incident), and the recently returned results are also archived. I tried to restore the backup; it ran for a few hours and then the database was corrupted again. The same happened on another attempt. Additionally, I tried to install MariaDB on my PC, and it just refused to work. Thank you, crunchers, for your contributions. I am very angry about this and I need to take a break. |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
Tomas Brada, I am very sorry. I once asked you: "Why does the DB get corrupted?" You did not answer that question. The DB corruption keeps recurring. The cause probably needs to be found: why does the DB get corrupted periodically? If this were an isolated case, it would be understandable, some random hardware glitch. But when it happens periodically, the cause must be found and eliminated. A project cannot operate like this! "I have a very recent backup (from a few minutes before the incident), and the recently returned results are also archived." This is also hard to understand. Why did the DB backup fail to restore, once and then a second time? One could assume that the backup is corrupted too. Is that possible? What should be done in that case? Unfortunately, history repeats itself. In the Stop@home project the DB was also corrupted. As I understand it, there was no backup. The project was stopped. I have become disillusioned with BOINC projects. See the message https://boinc.progger.info/odlk/forum_thread.php?id=1&postid=5301#5301 I suggest you stop struggling with BOINC and move to the manual subproject, if you want to help me with my research. XAVER does this successfully, and Demis did so earlier. I have efficient algorithms for finding ODLS (orthogonal diagonal Latin squares), as you have seen for yourself. I badly lack computing resources! My pleas to the administrators of the ODLK and ODLK1 projects had no effect. A brute-force algorithm is running there. Well, let it run until the end of time. By the way, about the DB of ODLS results in the TBEG project. I processed the results up to the very end; I took the last batch yesterday. I have the entire DB of solutions. I have already published the first part (more than a million canonical forms (CF) of ODLS). The second part currently contains 717728 CF ODLS. Tomorrow I will publish the second part of the DB. With that I am closing the book on the CF ODLS DB of your project. If you continue the project, please also take on the processing of results. This can be done automatically. |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
I am publishing the promised second part of the TBEG project DB https://cloud.mail.ru/public/5gJK/58nEGzmF5 The results of TBEG project for 4/11/2019 - 24/02/2020. I already explained why there are three authors when I published the first part of the DB. In total, the two parts of the TBEG project DB currently contain 1745285 CF ODLS. Tomas Brada, as I wrote above, I am stopping the processing of the CF ODLS DB of your project. If you continue the project, please implement automatic processing of results. It is very easy to do, and you are well familiar with it. Please continue the DB I started. It is hard for me to work with such quantities of squares on my PC with 2 GB of RAM. That is why, before the experiment was launched, I set the condition that results would be processed automatically. So far there is no such processing. PS. The first part of the TBEG project DB is published here https://boinc.progger.info/odlk/forum_thread.php?id=104&postid=4754#4754 |
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
We will see how it turns out. Regarding the backup: I am very sure that my backup of the project is not corrupted. It is the server that is having problems, probably the disk, which corrupts the active version of the database. The DB has checksums. Good news: the backup was imported into a database installed on my computer, and the SPT results are slowly but surely being processed. I will still take a break from working on the server. Maybe a week or two. |
Joined: 21 Jun 17 Posts: 4 Credit: 3,239,510 RAC: 0 |
Hi Tomas So, the backup is ok but the server is broken? Now the tricky question: When do you think the project will be online again? I am sure that many (including me) have completed tasks that are now just sitting on their computers waiting to be uploaded...maybe you can make a new (temporary) server available and just allow everyone to upload their tasks ? (But not to download new tasks). And perhaps, you can send out a message via BOINC Manager to let people know what is going on? regards Tim |
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
Hi Timbo! > So, the backup is ok but the server is broken? Yes, the backup is consistent. The backup imported into my computer also looks OK; it is currently being processed. It is likely a failing disk, but I am not sure. It could also be the memory or the MySQL software. I need to buy a new disk. > Now the tricky question: When do you think the project will be online again? I am sure that many (including me) have completed tasks that are now just sitting on their computers waiting to be uploaded...maybe you can make a new (temporary) server available and just allow everyone to upload their tasks ? (But not to download new tasks). It will be online when it is done, no ETA. I could enable file uploads, but I would risk corruption of them as well. I will think about it. > And perhaps, you can send out a message via BOINC Manager to let people know what is going on? No, this is not possible right now. Thank you, Timbo, for writing. |
Joined: 21 Jun 17 Posts: 4 Credit: 3,239,510 RAC: 0 |
Hi Tomas Many thanks for the swift reply. I will pass on the info to our Team so that the current members involved with your project (just 4 including myself) are aware of the situation. I can't do anything to help inform the other 333 active members who have joined the project though :-( re: Server "fix". Good luck getting the server fixed. If it is local to you, then it could be very simple to get working again, if it just needs a new HDD...as I am sure that not only do you have DB backups but also OS backups as well (which will allow the original project messageboard, teams and members databases to be restored to a new HDD). If it were me, my plan would be to test that the server itself is OK, by running some basic tests from a USB memory stick or from a simple Linux installer on a CDROM/DVDROM (but removing the original HDD first !!). Once I knew the server was OK, I would install a new HDD and restore the OS backup. If I ever suspect an HDD problem, it is cheaper in the long run to just get a new (and probably larger) HDD...trying to save a few dollars on the old HDD doesn't make sense if it fails again later on. Just my 2c !! regards Tim |
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
I agree, the HDD will be tossed. The static system files are installed on RAID1 btrfs over an SD card and the HDD. Btrfs is a checksummed filesystem, and it reports no errors, so I know that it is not corrupted. The dynamic data files (boinc upload/download and misc) are on btrfs in single mode over the HDD with metadata dup; again, it is checksummed and reports no errors. The database (boinc results, science results, including the message boards, nextcloud) was on InnoDB on ext4 in writeback mode over the HDD. InnoDB is a checksummed database, and it reported no errors at the time of the backup, so I know the backup is correct. It did report out-of-bounds errors and crashed. smartctl did NOT report errors and InnoDB did NOT report checksum errors either, which is strange. I have incremental semi-automatic backups over ssh set up. Even the boot partition was backed up for fast recovery in case of card or disk failure. I had a spare card and a spare disk on hand. And it did happen once: a disk failed back in the alpha stage, but at that time the db and data partitions were small and mirrored over to the card, so it was a quick swap. And that's why I have no spare disk now. The requirements for the disk are minor. It just has to be 2.5" SATA. The DB is under 8 gigs and the data is under 32 gigs. I am considering getting an SSD instead, because small SSDs cost the same as small rust-drives today, but I had a bad experience with a similar small SSD. While we are here: the processing has finished, so I will show some stats. The counts include duplicates and disabled groups, but the SPT(24), SPT(17), STPT(16) and TPT(10*2) are rare and exciting. 
ok=357052 inval=270
Count SPT: 9:1415304 10:0 11:320979 12:15032 13:6131 14:6598462 15:60 16:530544 17:1 18:25702 19:0 20:1314 21:0 22:69 23:0 24:4
Count STPT: 8:40655218 9:0 10:472777 11:0 12:22856 13:0 14:25 15:0 16:2
Count TPT: 6:2107451 7:98346 8:3084 9:88 10:2
# DB select
kind k start ofs
spt 17 589492143270716899 24 30 60 6 72 12 6 12
spt 24 528050771271601307 14 6 66 6 22 26 4 30 8 4 18 8
spt 24 587950582712698157 2 22 12 6 24 30 14 10 2 54 18 28
spt 24 22930603692243271 70 6 42 18 20 4 18 24 20 16 12 128
tpt 10 3324648277099157 52 16 10 64 10 16 58 22 28
tpt 10 31910610414019031 34 22 94 28 10 10 58 106 46
stpt 16 2640138520272677 2 10 2 16 2 22 2 34 |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
"While we are here: the processing has finished, so I will show some stats. The counts include duplicates and disabled groups, but the SPT(24), SPT(17), STPT(16) and TPT(10*2) are rare and exciting." The discussion is here: https://boinc.progger.info/odlk/forum_thread.php?id=49&postid=5322#5322 |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
nabializm, your post has been hidden as unrelated to the topic. |
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
The server received an upgrade: one SSD to replace the presumably faulty HDD. An SSD was chosen because it costs the same as a hard drive. I have migrated the logical volumes over to the new drive, updated the kernel, updated the operating system, and recompiled the latest BOINC server; currently I am testing the new drive. I have a small suspicion that there is electrical interference or a loose cable somewhere. Anyway, I re-enabled the file upload handler, so the last outstanding files can be uploaded even before the database and full operation are resumed. |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
Tomas Brada, thank you very much for the information. Is there hope that the project will return soon? |
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
Announce: Forum and server pages are now back online. |
Joined: 14 Jan 19 Posts: 119 Credit: 574 RAC: 0 |
Announce: a power outage has taken out the server's boot filesystem. We are restoring it from backup and will take the time to install a battery backup. |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
Is anyone alive here? For about three days now the TBEG project has not been opening for me; this is what I get. Is it just me who has this problem? Please respond, someone. My new article "SOLS and SODLS" in Russian https://yadi.sk/d/nvdI6TgBrKv72A in English https://yadi.sk/d/VeY9bx6_q6CcZg |
Joined: 6 Apr 17 Posts: 14336 Credit: 0 RAC: 0 |
Today the TBEG project opens for me. However, I do not see anything new there yet. |
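
A reading aid for the "DB select" rows in the statistics post above (for example `spt 17 589492143270716899 24 30 60 6 72 12 6 12`): the following is a minimal Python sketch, not the TBEG code. It assumes that each `spt` row gives `kind k start` followed by the first half of the gaps between consecutive members, and that the remaining gaps are the mirror image (for even `k` the last listed gap is taken to be the unpaired middle gap). That reading matches the number of values in each `spt` and `stpt` row but is not confirmed in the thread. The names `parse_row`, `expand_symmetric`, `is_symmetric` and `is_prime` are hypothetical.

```python
from typing import List, Tuple


def parse_row(line: str) -> Tuple[str, int, int, List[int]]:
    """Split a row like 'spt 17 <start> <gap> <gap> ...' into its fields."""
    fields = line.split()
    return fields[0], int(fields[1]), int(fields[2]), [int(g) for g in fields[3:]]


def expand_symmetric(start: int, k: int, half_gaps: List[int]) -> List[int]:
    """Rebuild all k members from the first half of the gaps.

    Assumption (not confirmed by the thread): a row lists k // 2 gaps.
    Odd k: the listed gaps are mirrored as they are.
    Even k: the last listed gap is the middle gap and is not repeated.
    """
    if len(half_gaps) != k // 2:
        raise ValueError("expected k // 2 gaps for a symmetric tuple")
    if k % 2 == 1:
        gaps = half_gaps + half_gaps[::-1]
    else:
        gaps = half_gaps + half_gaps[-2::-1]
    members = [start]
    for gap in gaps:
        members.append(members[-1] + gap)
    return members


def is_symmetric(members: List[int]) -> bool:
    """True if every mirrored pair of members has the same sum."""
    total = members[0] + members[-1]
    return all(members[i] + members[-1 - i] == total for i in range(len(members)))


def is_prime(n: int) -> bool:
    """Miller-Rabin test; these twelve bases are deterministic below 2**64."""
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True


if __name__ == "__main__":
    row = "spt 17 589492143270716899 24 30 60 6 72 12 6 12"
    kind, k, start, half_gaps = parse_row(row)
    members = expand_symmetric(start, k, half_gaps)
    print(kind, k, "symmetric:", is_symmetric(members))
    print("all members prime:", all(is_prime(p) for p in members))
```

If the primality line prints `False` for a row, the gap interpretation assumed here is the first thing to question. The `tpt` rows list `k - 1` values rather than `k // 2`, so they appear to follow a different layout and are not handled by this sketch.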
©2024 (C) Progger