For quite a while I've been getting "BUG: scheduling while atomic" kernel messages. I seem to recall there were some issues with SMP and mkiss at some point in the past. This isn't a hardware problem, since the issue persists after putting together a completely new system. The machine is currently running a Debian wheezy i386 userland with an x86_64 kernel. ax25_rebuild_header appears in all of these dumps, which seems suspicious. The hardware is an i7-4770K CPU @ 3.50GHz with 16 GB of RAM, dual Ethernet ports (acting as a router), a serial KISS port to a TNC, and an AXIP port.
Ham-related modules in use:
ipip                   12941  0
tunnel4                12629  1 ipip
ip_tunnel              21436  1 ipip
netrom                 36534  4
mkiss                  17161  2
ax25                   54676  60 mkiss,netrom
dmesg
[10433.518914] Hardware name: MSI MS-7850/Z87-G41 PC Mate(MS-7850), BIOS V1.2 06/07/2013
[10433.518915] 0000000000000000 ffff88002e21e7c0 ffffffff814b98af ffff8803f603e000
[10433.518917] ffffffff814b6f16 ffff88040eb93800 ffffffff814bd1ad 0000000000000000
[10433.518918] ffff88041fac3bd8 ffff8803f603ffd8 ffff8803f603ffd8 ffff8803f603ffd8
[10433.518919] Call Trace:
[10433.518920] <IRQ> [<ffffffff814b98af>] ? dump_stack+0x41/0x51
[10433.518927] [<ffffffff814b6f16>] ? __schedule_bug+0x46/0x55
[10433.518928] [<ffffffff814bd1ad>] ? __schedule+0x5cd/0x780
[10433.518931] [<ffffffff8108d3dd>] ? __cond_resched+0x1d/0x30
[10433.518932] [<ffffffff814bd3d7>] ? _cond_resched+0x27/0x30
[10433.518934] [<ffffffff814bc209>] ? mutex_lock_interruptible+0x9/0x40
[10433.518942] [<ffffffffa0309c08>] ? rp_write+0x68/0x340 [rocket]
[10433.518943] [<ffffffffa08adf0d>] ? ax_xmit+0x1ad/0x440 [mkiss]
[10433.518946] [<ffffffff813d2669>] ? dev_hard_start_xmit+0x319/0x500
[10433.518948] [<ffffffff8106ac18>] ? internal_add_timer+0x18/0x50
[10433.518950] [<ffffffff813f010d>] ? sch_direct_xmit+0xfd/0x1d0
[10433.518951] [<ffffffff813d2a40>] ? dev_queue_xmit+0x1f0/0x490
[10433.518954] [<ffffffffa08988f8>] ? ax25_rebuild_header+0x108/0x2b0 [ax25]
[10433.518956] [<ffffffff813d9e3d>] ? neigh_compat_output+0x8d/0xa0
[10433.518957] [<ffffffff8140a4d1>] ? ip_finish_output+0x1b1/0x3a0
[10433.518959] [<ffffffff8143ec85>] ? igmp_ifc_timer_expire+0x175/0x280
[10433.518960] [<ffffffff8143eb10>] ? igmp_group_added+0x170/0x170
[10433.518962] [<ffffffff8106ab1c>] ? call_timer_fn+0x2c/0x100
[10433.518963] [<ffffffff8143eb10>] ? igmp_group_added+0x170/0x170
[10433.518964] [<ffffffff8106c0d5>] ? run_timer_softirq+0x1f5/0x2a0
[10433.518967] [<ffffffff812860f1>] ? timerqueue_add+0x61/0xb0
[10433.518969] [<ffffffff81063bbe>] ? __do_softirq+0xde/0x220
[10433.518970] [<ffffffff814c875c>] ? call_softirq+0x1c/0x30
[10433.518973] [<ffffffff810155b5>] ? do_softirq+0x75/0xb0
[10433.518974] [<ffffffff81063e65>] ? irq_exit+0xa5/0xb0
[10433.518977] [<ffffffff810407cb>] ? smp_apic_timer_interrupt+0x3b/0x50
[10433.518979] [<ffffffff814c7a9d>] ? apic_timer_interrupt+0x6d/0x80
[10433.518979] <EOI> [<ffffffff814c8a2c>] ? sysenter_dispatch+0x7/0x21
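To pick out which loadable modules show up in a trace like the one above, the bracketed module tags can be filtered mechanically. The following is a minimal Python sketch and not part of the original report; the function name `module_frames` and the shortened `sample` string are illustrative assumptions.

```python
import re

# Matches one call-trace frame, for example:
#   [<ffffffffa08adf0d>] ? ax_xmit+0x1ad/0x440 [mkiss]
# The trailing "[module]" tag only appears on frames inside loadable modules.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?"                  # address, optional "?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"   # symbol+offset/size
    r"(?:\s+\[(?P<module>\w+)\])?"                   # optional module tag
)


def module_frames(trace):
    """Return (symbol, module) pairs for frames that belong to a module."""
    return [
        (m.group("symbol"), m.group("module"))
        for m in FRAME_RE.finditer(trace)
        if m.group("module")
    ]


# Shortened excerpt of the trace, used only to demonstrate the function.
sample = (
    "[10433.518934] [<ffffffff814bc209>] ? mutex_lock_interruptible+0x9/0x40 "
    "[10433.518942] [<ffffffffa0309c08>] ? rp_write+0x68/0x340 [rocket] "
    "[10433.518943] [<ffffffffa08adf0d>] ? ax_xmit+0x1ad/0x440 [mkiss] "
    "[10433.518946] [<ffffffff813d2669>] ? dev_hard_start_xmit+0x319/0x500"
)

for symbol, module in module_frames(sample):
    print(f"{module}: {symbol}")
```

Run against the full trace, this would list the rocket, mkiss, and ax25 frames. The frame listed immediately before rp_write is mutex_lock_interruptible, and the whole chain sits between <IRQ> and <EOI>. That may be worth a look, since taking a sleeping lock in interrupt context is a common cause of this message.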
Thanks for any ideas.
Bob Brose / N0QBJ
Hmmm... I've been running various EPEL kernels on CentOS 6 (currently 3.10.5) on an i5 / 4 GB machine with a locally compiled version of the VE7FET AX.25 apps/libs/tools (64-bit versions, not i386) without any issues for a few years now. This is all running a TNC in KISS mode using the mkiss and netrom modules. Maybe you could install the 64-bit versions of the AX.25 stuff and have better luck?
I also see there are references to IGMP in there... are you using any multicast in your setup? Maybe RIP?
--David
On 01/22/2014 09:02 PM, Robert Brose wrote:
[original message trimmed]
44Net mailing list 44Net@hamradio.ucsd.edu http://hamradio.ucsd.edu/mailman/listinfo/44net
The IGMP reference is interesting. I'm not doing any dynamic routing. On this particular gateway, I'm still downloading the encap entries. There are no IGMP or RIP kernel modules loaded. It does have a simple iptables NAT setup. It's IPv4 only.
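Note that IGMP is normally part of the kernel's IPv4 code rather than a separate module, so its absence from the module list does not by itself rule it out. One way to see which multicast groups each interface has joined is to read /proc/net/igmp. Below is a minimal Python sketch, assuming the usual layout of that file on a little-endian (x86) machine; `parse_igmp` and the sample data are illustrative and were not taken from this gateway.

```python
import socket
import struct


def parse_igmp(text):
    """Map each interface name to the multicast groups it has joined.

    Assumes the /proc/net/igmp layout: a header line, then one line per
    device ("<idx> <name> : <count> <querier>") followed by indented
    group lines whose first field is the group address as 8 hex digits.
    """
    groups = {}
    device = None
    for line in text.splitlines()[1:]:      # skip the header line
        if not line.strip():
            continue
        fields = line.split()
        if not line[0].isspace():           # device line
            device = fields[1].rstrip(":")
            groups[device] = []
        elif device is not None:            # indented group line
            # The address is printed as a raw 32-bit value, so on
            # little-endian machines (x86) the bytes appear reversed.
            raw = struct.pack("<I", int(fields[0], 16))
            groups[device].append(socket.inet_ntoa(raw))
    return groups


# Made-up sample in the assumed format, for demonstration only.
sample = (
    "Idx\tDevice    : Count Querier\tGroup    Users Timer\tReporter\n"
    "1\tlo        :     1      V3\n"
    "\t\t\t\t010000E0     1 0:00000000\t\t0\n"
    "2\teth0      :     2      V3\n"
    "\t\t\t\tFB0000E0     1 0:00000000\t\t0\n"
    "\t\t\t\t010000E0     1 0:00000000\t\t0\n"
)

print(parse_igmp(sample))
```

On a live system the same function could be fed the real file, for example `parse_igmp(open("/proc/net/igmp").read())`. Any interface that has joined a multicast group can cause the kernel to send IGMP reports from a timer, which would fit the igmp_ifc_timer_expire frame in the trace.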
The kernel is:
Linux kunk 3.12.6-reb-x64 #3 SMP Wed Jan 15 17:05:53 CST 2014 x86_64 GNU/Linux
It was compiled with the stock Debian .config; I just added the BSD pty support for AXIP. I needed the newer kernel for the onboard Ethernet and Haswell video.
I've seen the same issue with the stock kernels for a while (3.2 and 3.10, for example), which is why I was thinking it might be a known issue.
I'm pretty sure it was happening before I started running the x64 kernel with i386 userland too.
Bob
"David Ranch amprgw@trinnet.net says:"
[earlier reply trimmed]