Project

General

Profile

Bug #1051

Core becomes unresponsive, possibly because of database being locked

Added by dazjorz about 13 years ago. Updated over 7 years ago.

Status:
New
Priority:
High
Assignee:
-
Category:
Quassel Core
Target version:
-
Start date:
02/09/2011
Due date:
% Done:

0%

Estimated time:
Version:
0.7-pre
OS:
Linux

Description

I'm using Quassel 0.7.1, compiled myself from a Debian source package on my Debian Wheezy system (the new testing). I use a SQLite database for storage (libsqlite3 version 3.7.4). This is a multi-user Quasselcore installation.

I upgraded my Quasselcore to 0.7.1 about three weeks ago, and it has been running without problems. Today, around 03:15:00 tonight (last message I received in a busy channel), SQL query errors started appearing in my core.log. An info-level log is attached. The following error starts to appear:

2011-02-09 03:26:26 Error: unhandled Error in QSqlQuery!  
2011-02-09 03:26:26 Error:                   last Query: UPDATE networkSET connected = :connectedWHERE userid = :userid AND networkid = :networkid  
2011-02-09 03:26:26 Error:               executed Query: UPDATE networkSET connected = ?WHERE userid = ? AND networkid = ?  
2011-02-09 03:26:26 Error:                 bound Values: :connected=1, :networkid=12, :userid=4  
2011-02-09 03:26:26 Error:                 Error Number: 5  
2011-02-09 03:26:26 Error:                Error Message: database is locked Unable to fetch row  
2011-02-09 03:26:26 Error:               Driver Message: Unable to fetch row  
2011-02-09 03:26:26 Error:                   DB Message: database is locked  

This error appears for multiple queries, as can be seen in the logs. Seeing that this error occurs at 3 AM at night, and the core is usually used by about four people at the same time during the day, it seems unlikely that this is a multi-user issue. One of the IRC networks the Quasselcore had several connections to was unstable/down/whatever that night (other people couldn't connect either, but I wasn't there to get more information), and when someone tried to connect, he got no further than 'synchronising to core' (this is also in the logs at 07:15). I ran a quasselcore restart at around 17:54 the same day.

Setting priority to High because this makes the core unresponsive, requiring manual administrator access to 'fix'. Setting version to 0.7-pre, as 0.7.1 is not listed. If you have any tips for extra checks to add in case this happens again, I'll implement them in the current installation so the next additions to this bug report will be much more useful.

core.log (7.55 KB) core.log One night's worth of log messages dazjorz, 02/09/2011 06:44 PM

History

#1 Updated by dazjorz about 13 years ago

Happened again today (I noticed within five seconds). Log contains about the same. Looks like quassel-storage.sqlite is used by the same process a few times:

$ sudo lsof quassel-storage.sqlite
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
quasselco 21231 quasselcore 7u REG 202,2 721222656 499746 quassel-storage.sqlite
quasselco 21231 quasselcore 16u REG 202,2 721222656 499746 quassel-storage.sqlite
quasselco 21231 quasselcore 17u REG 202,2 721222656 499746 quassel-storage.sqlite
quasselco 21231 quasselcore 18u REG 202,2 721222656 499746 quassel-storage.sqlite
quasselco 21231 quasselcore 35u REG 202,2 721222656 499746 quassel-storage.sqlite

Maybe two of these file handles are accessed for queries at the same time. It's weird that this has only just started to happen. Maybe it's because the database is almost 700 MB; doing one update to that file might take long enough for another thread to start updating the database too, causing a race condition. But that's just speculation.

#2 Updated by dazjorz about 13 years ago

dazjorz wrote:

Happened again today (I noticed within five seconds).

Minutes. I'm not that fast. :/

#3 Updated by milian almost 13 years ago

I encounter something similar - though I have no log file hence don't see that query personally. What I do see is the backtrace when attaching to a "hanging" quasselcore:

(gdb) thread apply all bt

Thread 7 (Thread 0xb6718b70 (LWP 16220)):
#0 0xb7726424 in kernel_vsyscall ()
#1 0xb6ecb696 in *__GI
_poll (fds=0xb6f4eff4, nfds=1, timeout=29970) at ../sysdeps/unix/sysv/linux/poll.c:87
#2 0xb6d6d06b in g_poll () from /lib/libglib-2.0.so.0
#3 0xb6d5fdbc in ?? () from /lib/libglib-2.0.so.0
#4 0xb6d601c8 in g_main_context_iteration () from /lib/libglib-2.0.so.0
#5 0xb7614075 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#6 0xb75e6ae9 in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#7 0xb75e6f3a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#8 0xb74e9629 in QThread::exec() () from /usr/lib/libQtCore.so.4
#9 0x080bc2f3 in SessionThread::run() ()
#10 0xb74ebfbe in ?? () from /usr/lib/libQtCore.so.4
#11 0xb7468955 in start_thread (arg=0xb6718b70) at pthread_create.c:300
#12 0xb6ed8e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 6 (Thread 0xb5f17b70 (LWP 16221)):
#0 0xb7726424 in __kernel_vsyscall ()
#1 0xb746cf7f in __pthread_cond_wait (cond=0xa1d7f70, mutex=0xa1d7f58) at pthread_cond_wait.c:153
#2 0xb74ecfb7 in QWaitCondition::wait(QMutex*, unsigned long) () from /usr/lib/libQtCore.so.4
#3 0xb74e7013 in QReadWriteLock::lockForWrite() () from /usr/lib/libQtCore.so.4
#4 0x080bf5f1 in SqliteStorage::logMessage(Message&) ()
#5 0x080ecc8f in CoreSession::processMessages() ()
#6 0x080ed1f3 in CoreSession::customEvent(QEvent*) ()
#7 0xb75f8e5c in QObject::event(QEvent*) () from /usr/lib/libQtCore.so.4
#8 0xb75e78b3 in QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
#9 0xb75e8033 in QCoreApplication::notify(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
#10 0xb75e84cb in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
#11 0xb75ebf2c in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () from /usr/lib/libQtCore.so.4
#12 0xb75ec0ad in QCoreApplication::sendPostedEvents(QObject*, int) () from /usr/lib/libQtCore.so.4
#13 0xb761457f in ?? () from /usr/lib/libQtCore.so.4
#14 0xb6d5c305 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
#15 0xb6d5ffe8 in ?? () from /lib/libglib-2.0.so.0
#16 0xb6d601c8 in g_main_context_iteration () from /lib/libglib-2.0.so.0
#17 0xb7614075 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#18 0xb75e6ae9 in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#19 0xb75e6f3a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#20 0xb74e9629 in QThread::exec() () from /usr/lib/libQtCore.so.4
#21 0x080bc2f3 in SessionThread::run() ()
---Type <return> to continue, or q <return> to quit---
#22 0xb74ebfbe in ?? () from /usr/lib/libQtCore.so.4
#23 0xb7468955 in start_thread (arg=0xb5f17b70) at pthread_create.c:300
#24 0xb6ed8e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 5 (Thread 0xb55ffb70 (LWP 16222)):
#0 0xb7726424 in __kernel_vsyscall ()
#1 0xb746cf7f in __pthread_cond_wait (cond=0xb733526c, mutex=0xb7335254) at pthread_cond_wait.c:153
#2 0xb7257427 in ?? () from /usr/lib/libQtScript.so.4
#3 0xb7257471 in ?? () from /usr/lib/libQtScript.so.4
#4 0xb7468955 in start_thread (arg=0xb55ffb70) at pthread_create.c:300
#5 0xb6ed8e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 4 (Thread 0xb43ffb70 (LWP 16223)):
#0 0xb7726424 in __kernel_vsyscall ()
#1 0xb746d482 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:179
#2 0xb74ecf3f in QWaitCondition::wait(QMutex*, unsigned long) () from /usr/lib/libQtCore.so.4
#3 0xb74e0f6b in ?? () from /usr/lib/libQtCore.so.4
#4 0xb74ebfbe in ?? () from /usr/lib/libQtCore.so.4
#5 0xb7468955 in start_thread (arg=0xb43ffb70) at pthread_create.c:300
#6 0xb6ed8e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 3 (Thread 0xb3bfeb70 (LWP 16224)):
#0 0xb7726424 in __kernel_vsyscall ()
#1 0xb746d482 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:179
#2 0xb74ecf3f in QWaitCondition::wait(QMutex*, unsigned long) () from /usr/lib/libQtCore.so.4
#3 0xb74e0f6b in ?? () from /usr/lib/libQtCore.so.4
#4 0xb74ebfbe in ?? () from /usr/lib/libQtCore.so.4
#5 0xb7468955 in start_thread (arg=0xb3bfeb70) at pthread_create.c:300
#6 0xb6ed8e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 2 (Thread 0xb33fdb70 (LWP 16225)):
#0 0xb7726424 in __kernel_vsyscall ()
#1 0xb746d482 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:179
#2 0xb74ecf3f in QWaitCondition::wait(QMutex*, unsigned long) () from /usr/lib/libQtCore.so.4
#3 0xb74e0f6b in ?? () from /usr/lib/libQtCore.so.4
#4 0xb74ebfbe in ?? () from /usr/lib/libQtCore.so.4
#5 0xb7468955 in start_thread (arg=0xb33fdb70) at pthread_create.c:300
#6 0xb6ed8e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

---Type <return> to continue, or q <return> to quit---
Thread 1 (Thread 0xb6cec6d0 (LWP 16219)):
#0 0xb7726424 in __kernel_vsyscall ()
#1 0xb6ea4516 in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#2 0xb6ed2a4c in usleep (useconds=100000) at ../sysdeps/unix/sysv/linux/usleep.c:33
#3 0xb675c802 in ?? () from /usr/lib/libsqlite3.so.0
#4 0xb6727f0e in ?? () from /usr/lib/libsqlite3.so.0
#5 0xb6722cf8 in ?? () from /usr/lib/libsqlite3.so.0
#6 0xb6722370 in ?? () from /usr/lib/libsqlite3.so.0
#7 0xb673ff62 in ?? () from /usr/lib/libsqlite3.so.0
#8 0xb6748692 in ?? () from /usr/lib/libsqlite3.so.0
#9 0xb678e463 in ?? () from /usr/lib/libsqlite3.so.0
#10 0xb6779670 in sqlite3_step () from /usr/lib/libsqlite3.so.0
#11 0xb698aa31 in ?? () from /usr/lib/qt4/plugins/sqldrivers/libqsqlite.so
#12 0xb698bb22 in ?? () from /usr/lib/qt4/plugins/sqldrivers/libqsqlite.so
#13 0xb70a066c in QSqlQuery::exec() () from /usr/lib/libQtSql.so.4
#14 0x080bca70 in SqliteStorage::safeExec(QSqlQuery&, int) ()
#15 0x080bcb06 in SqliteStorage::safeExec(QSqlQuery&, int) ()
#16 0x080bcb06 in SqliteStorage::safeExec(QSqlQuery&, int) ()
#17 0x080bcb06 in SqliteStorage::safeExec(QSqlQuery&, int) ()
#18 0x080bcb06 in SqliteStorage::safeExec(QSqlQuery&, int) ()
#19 0x080c3413 in SqliteStorage::validateUser(QString const&, QString const&) ()
#20 0x0808b673 in Core::processClientMessage(QTcpSocket*, QMap<QString, QVariant> const&) ()
#21 0x0808e4ce in Core::clientHasData() ()
#22 0x080d6d7b in Core::qt_metacall(QMetaObject::Call, int, void**) ()
#23 0xb75ed7aa in QMetaObject::metacall(QObject*, QMetaObject::Call, int, void**) () from /usr/lib/libQtCore.so.4
#24 0xb75fc1bb in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/lib/libQtCore.so.4
#25 0xb764c4c7 in QIODevice::readyRead() () from /usr/lib/libQtCore.so.4
#26 0xb7401645 in ?? () from /usr/lib/libQtNetwork.so.4
#27 0xb73f6878 in ?? () from /usr/lib/libQtNetwork.so.4
#28 0xb73fa526 in QSslSocket::qt_metacall(QMetaObject::Call, int, void**) () from /usr/lib/libQtNetwork.so.4
#29 0xb75ed7aa in QMetaObject::metacall(QObject*, QMetaObject::Call, int, void**) () from /usr/lib/libQtCore.so.4
#30 0xb75fc1bb in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/lib/libQtCore.so.4
#31 0xb764c4c7 in QIODevice::readyRead() () from /usr/lib/libQtCore.so.4
#32 0xb73dd902 in ?? () from /usr/lib/libQtNetwork.so.4
#33 0xb73ca86b in ?? () from /usr/lib/libQtNetwork.so.4
#34 0xb73cbc33 in ?? () from /usr/lib/libQtNetwork.so.4
#35 0xb75e78b3 in QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
#36 0xb75e8033 in QCoreApplication::notify(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
#37 0xb75e84cb in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/lib/libQtCore.so.4
---Type <return> to continue, or q <return> to quit---
#38 0xb761444a in ?? () from /usr/lib/libQtCore.so.4
#39 0xb6d5c305 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
#40 0xb6d5ffe8 in ?? () from /lib/libglib-2.0.so.0
#41 0xb6d601c8 in g_main_context_iteration () from /lib/libglib-2.0.so.0
#42 0xb7614075 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#43 0xb75e6ae9 in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#44 0xb75e6f3a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQtCore.so.4
#45 0xb75ec16f in QCoreApplication::exec() () from /usr/lib/libQtCore.so.4
#46 0x0807da74 in main ()
(gdb)

this might be a sqlite / debian bug, can someone reproduce it? Esp. the sleep for 1000 seconds sounds really strange to me!

#4 Updated by Robsdedude over 7 years ago

Encountered the same issue.

Quassel IRC: v0.12.4 (dist-90183ee)
Running with sqlite: 3.8.7.1 2014-10-29 13:59:56 3b7b72c4685aa5cf5e675c2c47ebec10d9704221
on Debian 8

any fixes? Restarting the core causes the same issue again...

Also available in: Atom PDF