OTRS Not sending mail - URGENT

I have an OTRS Jumpbox that is failing to send mail.
On doing some elementary sleuthing, I find the following. Note the 100% used on /dev/sda1.

Output of df -h:

admin@eadmin:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       965M  916M     0 100% /
varrun         1014M   84K 1014M   1% /var/run
varlock        1014M     0 1014M   0% /var/lock
udev           1014M   44K 1014M   1% /dev
devshm         1014M     0 1014M   0% /dev/shm
/dev/sdb1       100G   24G   71G  26% /storage

In /var/log/mail.log, I see this error:

Jun 24 07:08:06 pcbca-otrs2 postfix/postdrop[30055]: warning: uid=33: No space left on device
Jun 24 07:08:06 pcbca-otrs2 postfix/sendmail[30054]: fatal: (33): queue file write error

Please help - URGENT!

OTRS Not sending mail - URGENT

Well, it's gonna be a search-and-destroy mission for whatever has filled up your disk. Chances are it's log files in /var/log. To list the largest files sitting directly in /var/log, biggest first, do:

ls -lhS /var/log/ | head -n 20

If that's not it, start checking the size of subdirectories under / (excluding /storage). For instance, to see the size of things under /var/, do this:

sudo du -sh /var/*
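If the culprit is not obvious from /var alone, a sweep of the whole root filesystem can narrow it down. The following is a minimal sketch, assuming GNU du and sort; the function name biggest_dirs is made up for illustration. The -x flag keeps du on a single filesystem, so a separately mounted /storage is skipped without having to exclude it by hand.

```shell
# biggest_dirs is a hypothetical helper, not a standard command.
# Usage: biggest_dirs [start_dir] [how_many]
biggest_dirs() {
    start="${1:-/}"
    count="${2:-20}"
    # -x: stay on one filesystem; -k: sizes in KB so sort -n can order them
    du -xk --max-depth=2 "$start" 2>/dev/null | sort -rn | head -n "$count"
}
```

Run it from a root shell (for example after sudo -s) so du can read every directory. The first line of output is the total for the starting directory itself; the lines after it are the largest directories beneath it.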

Another thing that can take up space: if you have installed packages, the downloaded .deb files get left behind in the package cache. They can be cleaned up with:

sudo apt-get clean

But if you haven't done apt-get install for anything, it won't do any good.

Austin

Results of above

Results of above commands:

admin@eadmin:/var$ ls -lhS /var/log/ | head -n 20
total 6.1M
-rw-r----- 1 syslog adm 553K Jun 24 13:30 messages
-rw-r----- 1 syslog adm 551K Jun 24 13:30 user.log
-rw-r----- 1 syslog adm 548K Jun 24 13:30 auth.log
-rw-r----- 1 syslog adm 449K Jun 20 06:47 auth.log.0
-rw-r----- 1 syslog adm 413K Jun 20 06:46 user.log.0
-rw-r----- 1 syslog adm 412K Jun 20 06:46 messages.0
-rw-r--r-- 1 root root 352K Jun 24 12:53 udev
-rw-r----- 1 syslog adm 336K Jun 19 22:04 mail.info.0
-rw-r----- 1 syslog adm 336K Jun 19 22:04 mail.log.0
-rw-r--r-- 1 root root 286K Jun 24 13:21 lastlog
-rw-r----- 1 syslog adm 204K Mar 10 19:06 kern.log.0
-rw-r----- 1 syslog adm 103K Jun 24 13:28 mail.info
-rw-r----- 1 syslog adm 103K Jun 24 13:28 mail.log
-rw-r----- 1 syslog adm 79K Jun 18 06:44 user.log.1.gz
-rw-r----- 1 syslog adm 79K Jun 18 06:44 messages.1.gz
-rw-r----- 1 syslog adm 78K Jun 11 06:44 user.log.3.gz
-rw-r----- 1 syslog adm 78K Jun 11 06:44 messages.3.gz
-rw-r----- 1 syslog adm 76K Jun 4 06:32 user.log.4.gz
-rw-r----- 1 syslog adm 76K Jun 4 06:32 messages.4.gz

admin@eadmin:/var$ sudo du -sh /var/*
383K /var/backups
31M /var/cache
24G /var/data
227M /var/lib
1.0K /var/local
0 /var/lock
386M /var/log
393K /var/mail
1.0K /var/opt
84K /var/run
307K /var/spool
1.0K /var/tmp
9.0K /var/www

Also logged into MySQL and ran this query to display space used:

mysql> SELECT
-> table_schema AS 'Db Name',
-> Round( Sum( data_length + index_length ) / 1024 / 1024, 3) AS 'Db Size (MB)',
-> Round( Sum( data_free ) / 1024 / 1024, 3 ) AS 'Free Space (MB)'
-> FROM information_schema.tables
-> GROUP BY table_schema ;
+--------------------+--------------+-----------------+
| Db Name            | Db Size (MB) | Free Space (MB) |
+--------------------+--------------+-----------------+
| information_schema |        0.004 |           0.000 |
| mysql              |        0.396 |           0.000 |
| otrs               |      162.292 |           0.343 |
+--------------------+--------------+-----------------+
3 rows in set (0.02 sec)

Look around more in /var/log.

Look around more in /var/log. This is your issue:

386M	/var/log

It must be in one of the subdirectories. Run the same du one level down:

sudo du -sh /var/log/*

Austin

Large Error and Access log

Large Error and Access log files:

admin@eadmin:/var/log/apache2$ sudo du -sh /var/log/apache2/*
47M /var/log/apache2/access.log
144M /var/log/apache2/access.log.1
3.3M /var/log/apache2/access.log.2.gz
3.6M /var/log/apache2/access.log.3.gz
3.3M /var/log/apache2/access.log.4.gz
178M /var/log/apache2/error.log
1.5M /var/log/apache2/error.log.1
20K /var/log/apache2/error.log.2.gz
977K /var/log/apache2/error.log.3.gz
6.0K /var/log/apache2/error.log.4.gz

logs

So either your log rotation has stopped working, you now have so much traffic that the weekly rotation is no longer keeping up with it, or there is something bad going on that is generating more log info than usual.

Once you get some deleted, you might want to take a look at that giant error.log and see what nastiness is in there. Also check the timestamps of the first and last entries of both the access and error logs:

head -n 1 /var/log/apache2/access.log | awk '{print $4}'
tail -n 1 /var/log/apache2/access.log | awk '{print $4}'

and then:

head -n 1 /var/log/apache2/error.log | awk '{print $1, $2, $3, $4, $5}'
tail -n 1 /var/log/apache2/error.log | awk '{print $1, $2, $3, $4, $5}'
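To see what is actually filling a large error.log, it can help to count how often each distinct message appears. This is a rough sketch, assuming every line starts with a bracketed timestamp such as [Tue Jun 22 15:57:49 2010]; the function name top_errors is hypothetical.

```shell
# top_errors is a hypothetical helper, not part of Apache or OTRS.
# Usage: top_errors [logfile]
top_errors() {
    log="${1:-/var/log/apache2/error.log}"
    # Drop the leading "[timestamp] " so identical messages group together,
    # then count each distinct message and show the ten most frequent.
    sed 's/^\[[^]]*] //' "$log" | sort | uniq -c | sort -rn | head -n 10
}
```

If one message dominates the output, fixing its cause will probably do more for disk usage than any change to the rotation schedule.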

Austin

admin@eadmin:~$ head -n 1 /var/log/apache2/access.log | awk '{print $4}'
[20/Jun/2010:06:39:11
admin@eadmin:~$ tail -n 1 /var/log/apache2/access.log | awk '{print $4}'
[24/Jun/2010:14:07:55

In the error.log file, I'm seeing a lot of this kind of error - thousands and thousands, all at around the same timestamp:

[Tue Jun 22 15:57:49 2010] -e: Malformed UTF-8 character (unexpected non-continuation byte 0x74, immediately after start byte 0xea)$

This most likely indicates we got an email from China. Those have been giving me problems all month...

As for the access.log, I have most users' screens auto-refreshing every 2 minutes. I'll be looking at raising this to 5 minutes to slow down the growth rate of the log file.
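To measure whether a longer refresh interval really slows the log's growth, the requests can be counted per day before and after the change. A minimal sketch, assuming the timestamp is the fourth whitespace-separated field of each access.log line, as in the output above; the function name requests_per_day is hypothetical.

```shell
# requests_per_day is a hypothetical helper.
# Usage: requests_per_day [logfile]
requests_per_day() {
    log="${1:-/var/log/apache2/access.log}"
    awk '{
        day = substr($4, 2, 11)      # "[20/Jun/2010:06:39:11" -> "20/Jun/2010"
        if (!(day in hits)) order[++n] = day
        hits[day]++
    }
    END { for (i = 1; i <= n; i++) print hits[order[i]], order[i] }' "$log"
}
```

Each output line is a request count followed by the date, in the order the dates first appear in the log.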

apache logs

OK, well, as far as the logs are concerned, we can disable them entirely or we can rotate them more often. The default schedule rotates weekly and keeps several old copies. We can fully control both. It's up to you.
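For reference, rotation is usually controlled by a logrotate stanza. The fragment below is only a sketch of what a more aggressive schedule might look like, assuming a Debian/Ubuntu-style layout where the stanza lives in /etc/logrotate.d/apache2. The values (daily, 7 copies) and the reload command are illustrative assumptions, not settings taken from this server.

```
# Illustrative values only; adjust to taste.
/var/log/apache2/*.log {
    daily               # rotate every day instead of weekly
    rotate 7            # keep 7 old copies, delete anything older
    compress            # gzip rotated logs
    delaycompress       # leave the newest rotated copy uncompressed
    missingok           # do not complain if a log is absent
    notifempty          # skip rotation when the log is empty
    sharedscripts       # run postrotate once, not once per file
    postrotate
        # assumed init script path; makes Apache reopen its log files
        /etc/init.d/apache2 reload > /dev/null
    endscript
}
```

A dry run with logrotate -d on the file shows what would happen without touching any logs.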

Austin

Success - somewhat...

Ran these:

admin@eadmin:/var/log/apache2$ sudo gzip access.log.1
admin@eadmin:/var/log/apache2$ sudo gzip error.log.1

Now:

admin@eadmin:/var/log/apache2$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 965M 775M 142M 85% /
varrun 1014M 84K 1014M 1% /var/run
varlock 1014M 0 1014M 0% /var/lock
udev 1014M 44K 1014M 1% /dev
devshm 1014M 0 1014M 0% /dev/shm
/dev/sdb1 100G 24G 71G 26% /storage

Still need a long term solution though...
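Part of a longer-term answer is noticing the problem before the disk hits 100%. The following is a small sketch of a check that could be run from cron; the function names and the 90% threshold are assumptions chosen for illustration.

```shell
# usage_pct and check_disk are hypothetical helpers.
usage_pct() {
    # -P forces one line per filesystem; column 5 is the "Use%" figure
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage: check_disk [mount_point] [threshold_percent]
check_disk() {
    mount="${1:-/}"
    threshold="${2:-90}"
    pct=$(usage_pct "$mount")
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $mount is ${pct}% full"
    fi
}
```

Calling check_disk / 90 prints a warning only when the root filesystem is at or above 90% used, so cron stays quiet the rest of the time.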

Success - somewhat...

Yeah, see my other post.

Austin

Delete the old ones

It is safe to delete any of those log files whose names end in .1, .2.gz, .3.gz, etc.

sudo rm /var/log/apache2/access.log.1
sudo rm /var/log/apache2/*.gz
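Before deleting, it can be useful to know how much space the rotated copies actually occupy. A minimal sketch, assuming rotated files end in a single digit or in .gz as in the listing from this thread; the function name rotated_size is hypothetical.

```shell
# rotated_size is a hypothetical helper.
# Usage: rotated_size [log_dir]   (prints a total in KB)
rotated_size() {
    dir="${1:-/var/log/apache2}"
    # *.[0-9] matches access.log.1; *.gz matches the compressed older copies.
    # The live access.log and error.log are deliberately not counted.
    du -ck "$dir"/*.[0-9] "$dir"/*.gz 2>/dev/null | tail -n 1 | awk '{print $1}'
}
```

Comparing this figure with the Avail column of df -h shows how much headroom the cleanup will buy.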

Austin

Looks like it's apache2.

admin@eadmin:/var$ sudo du -sh /var/log/*
380M /var/log/apache2
2.0K /var/log/apt
552K /var/log/auth.log
452K /var/log/auth.log.0
61K /var/log/auth.log.1.gz
18K /var/log/auth.log.2.gz
63K /var/log/auth.log.3.gz
62K /var/log/auth.log.4.gz
61K /var/log/auth.log.5.gz
63K /var/log/auth.log.6.gz
1.0K /var/log/boot
0 /var/log/btmp
0 /var/log/btmp.1
0 /var/log/daemon.log
3.0K /var/log/daemon.log.0
1.0K /var/log/daemon.log.1.gz
1.0K /var/log/daemon.log.2.gz
1.0K /var/log/daemon.log.3.gz
0 /var/log/debug
56K /var/log/debug.0
37K /var/log/dmesg
37K /var/log/dmesg.0
10K /var/log/dmesg.1.gz
10K /var/log/dmesg.2.gz
10K /var/log/dmesg.3.gz
10K /var/log/dmesg.4.gz
0 /var/log/dpkg.log
2.0K /var/log/dpkg.log.1
3.0K /var/log/fsck
0 /var/log/kern.log
205K /var/log/kern.log.0
3.0K /var/log/lastlog
0 /var/log/lpr.log
0 /var/log/mail.err
1.0K /var/log/mail.err.0
104K /var/log/mail.info
339K /var/log/mail.info.0
43K /var/log/mail.info.1.gz
42K /var/log/mail.info.2.gz
37K /var/log/mail.info.3.gz
104K /var/log/mail.log
339K /var/log/mail.log.0
43K /var/log/mail.log.1.gz
42K /var/log/mail.log.2.gz
37K /var/log/mail.log.3.gz
0 /var/log/mail.warn
1.0K /var/log/mail.warn.0
1.0K /var/log/mail.warn.1.gz
557K /var/log/messages
415K /var/log/messages.0
80K /var/log/messages.1.gz
24K /var/log/messages.2.gz
79K /var/log/messages.3.gz
77K /var/log/messages.4.gz
73K /var/log/messages.5.gz
73K /var/log/messages.6.gz
1.0K /var/log/mysql
0 /var/log/mysql.err
1.0K /var/log/news
12K /var/log/samba
12K /var/log/syslog
24K /var/log/syslog.0
23K /var/log/syslog.1.gz
37K /var/log/syslog.2.gz
19K /var/log/syslog.3.gz
21K /var/log/syslog.4.gz
39K /var/log/syslog.5.gz
40K /var/log/syslog.6.gz
1.0K /var/log/sysstat
355K /var/log/udev
555K /var/log/user.log
416K /var/log/user.log.0
80K /var/log/user.log.1.gz
24K /var/log/user.log.2.gz
79K /var/log/user.log.3.gz
77K /var/log/user.log.4.gz
73K /var/log/user.log.5.gz
73K /var/log/user.log.6.gz
1.0K /var/log/vmware-tools-guestd
10K /var/log/wtmp
0 /var/log/wtmp.1