An In-Depth Investigation Sparked by Debugging a Touchscreen Interrupt

Author: heaven  Published: 2018-4-23 16:01  Category: Linux Kernel Analysis

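Hello everyone. My name is Zhang Binghua (张昺华; the middle character is pronounced "bǐng", like 饼). First, many thanks to Professor Chen Lijun (陈莉君) for the guidance; the title of this article was also Professor Chen's suggestion. I am honored to have this article published on Wowotech (蜗窝), and I thank Guo Daxia (郭大侠) for the opportunity.

What follows is my own original work: some notes from solving a real problem. If anything is described inaccurately, please point it out. Thank you very much.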


Linux kernel version: linux-4.9.18

While debugging a touchscreen, I once ran into the following problem:

 

/startup/modules #
[  233.370296] irq 44: nobody cared (try booting with the "irqpoll" option)
[  233.376983] CPU: 0 PID: 0 Comm: swapper Tainted: G           O    4.9.18 #8
[  233.383912] Hardware name: Broadcom Cygnus SoC
[  233.388378] [<c010cbfc>] (unwind_backtrace) from [<c010a5fc>] (show_stack+0x10/0x14)
[  233.396103] [<c010a5fc>] (show_stack) from [<c0145d38>] (__report_bad_irq+0x24/0xa4)
[  233.403821] [<c0145d38>] (__report_bad_irq) from [<c0145fdc>] (note_interrupt+0x1c8/0x274)
[  233.412052] [<c0145fdc>] (note_interrupt) from [<c014400c>] (handle_irq_event_percpu+0x44/0x50)
[  233.420715] [<c014400c>] (handle_irq_event_percpu) from [<c0144040>] (handle_irq_event+0x28/0x3c)
[  233.429550] [<c0144040>] (handle_irq_event) from [<c0146574>] (handle_simple_irq+0x70/0x78)
[  233.437868] [<c0146574>] (handle_simple_irq) from [<c01438d8>] (generic_handle_irq+0x18/0x28)
[  233.446366] [<c01438d8>] (generic_handle_irq) from [<c02adb3c>] (iproc_gpio_irq_handler+0xd0/0x11c)
[  233.455376] [<c02adb3c>] (iproc_gpio_irq_handler) from [<c01438d8>] (generic_handle_irq+0x18/0x28)
[  233.464297] [<c01438d8>] (generic_handle_irq) from [<c0143980>] (__handle_domain_irq+0x80/0xa4)
[  233.472959] [<c0143980>] (__handle_domain_irq) from [<c01013d0>] (gic_handle_irq+0x50/0x84)
[  233.481275] [<c01013d0>] (gic_handle_irq) from [<c010b02c>] (__irq_svc+0x6c/0x90)
[  233.488723] Exception stack(0xc0901f60 to 0xc0901fa8)
[  233.493754] 1f60: c0112900 c0717028 c0901fb8 00000000 c093af4c 00000000 00000335 c0826220
[  233.501896] 1f80: 00000001 414fc091 df9eab80 00000000 c0900038 c0901fb0 c010843c c0108440
[  233.510034] 1fa0: 60000013 ffffffff
[  233.513514] [<c010b02c>] (__irq_svc) from [<c0108440>] (arch_cpu_idle+0x2c/0x38)
[  233.520887] [<c0108440>] (arch_cpu_idle) from [<c013a6ec>] (cpu_startup_entry+0x50/0xc0)
[  233.528956] [<c013a6ec>] (cpu_startup_entry) from [<c0800d70>] (start_kernel+0x414/0x4b0)
[  233.537097] handlers:
[  233.539363] [<c014408c>] irq_default_primary_handler threaded [<bf03ff68>] synaptics_rmi4_irq [synaptics_dsx]
[  233.549300] Disabling IRQ #44

 

First, let's follow the error message into the Linux kernel source.

kernel/irq/spurious.c

 

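From the message printed in the log ("nobody cared"), we can tell that __report_bad_irq took its else branch, which means bad_action_ret(action_ret) returned 0.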
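To make that branch concrete, here is a small user-space sketch of the decision. It is a simplified model and not the kernel source: the enum values mirror the kernel's irqreturn values, while report_message is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Values mirror enum irqreturn in include/linux/irqreturn.h. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 };

/* Model of bad_action_ret(): anything above HANDLED|WAKE_THREAD is bogus. */
static int bad_action_ret(unsigned int action_ret)
{
    return action_ret > (IRQ_HANDLED | IRQ_WAKE_THREAD);
}

/*
 * Hypothetical helper: writes into buf the message that the report
 * function would choose for this return value.
 */
static int report_message(char *buf, size_t len,
                          unsigned int irq, unsigned int action_ret)
{
    assert(buf != NULL && len > 0);
    if (bad_action_ret(action_ret))
        return snprintf(buf, len, "irq event %u: bogus return value %x",
                        irq, action_ret);
    /* else branch: the return value was legal, but nobody handled the IRQ */
    return snprintf(buf, len,
                    "irq %u: nobody cared (try booting with the \"irqpoll\" option)",
                    irq);
}
```

Calling report_message with irq 44 and IRQ_NONE yields the "nobody cared" text seen in the log, whereas a value such as 4 would yield the "bogus return value" text instead.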

The dump_stack output printed by this function lets us trace back through its callers.

 

drivers/pinctrl/bcm/pinctrl-iproc-gpio.c


kernel/irq/chip.c

handle_simple_irq  (the flow handler that appears in the backtrace)

===> handle_irq_event  (kernel/irq/handle.c)

===> handle_irq_event_percpu   (kernel/irq/handle.c)

===> __handle_irq_event_percpu  (kernel/irq/handle.c)

 

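The log shows note_interrupt in the backtrace. handle_irq_event_percpu only calls note_interrupt when noirqdebug is 0, so we know noirqdebug=0 here.

kernel/irq/handle.c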

 

We have already established that bad_action_ret(action_ret) returns 0.

Therefore, inside note_interrupt, execution can only go through the following branch:

kernel/irq/spurious.c


From this code we can see that, for the error above to be reported, the following condition must hold:

desc->irqs_unhandled > 99900
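The threshold check can be sketched in user space as follows. This is a simplified model under stated assumptions: the struct and function names are hypothetical, the 100000-interrupt window is taken from my reading of note_interrupt in 4.9, and the time-based reset of irqs_unhandled is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the counters in struct irq_desc. */
struct irq_desc_model {
    unsigned int irq_count;       /* interrupts seen in the current window */
    unsigned int irqs_unhandled;  /* of those, how many nobody handled */
    bool disabled;                /* set once the line is declared stuck */
};

/*
 * Returns true on the call that would print "nobody cared" and
 * disable the IRQ. The check only runs once per 100000 interrupts.
 */
static bool note_interrupt_model(struct irq_desc_model *desc, bool unhandled)
{
    assert(desc != NULL);
    if (unhandled)
        desc->irqs_unhandled++;

    desc->irq_count++;
    if (desc->irq_count < 100000)
        return false;             /* window not complete yet */

    desc->irq_count = 0;
    bool stuck = desc->irqs_unhandled > 99900;
    if (stuck)
        desc->disabled = true;    /* models "Disabling IRQ #44" */
    desc->irqs_unhandled = 0;     /* start a fresh window */
    return stuck;
}
```

In this model, 100000 consecutive unhandled interrupts trip the check, while a window with exactly 99900 unhandled ones does not, because the comparison is strictly greater than.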

For that condition to be met, irqs_unhandled has to grow, and the only place where irqs_unhandled++ happens is here:

kernel/irq/spurious.c


As this code shows, the following condition must hold:

action_ret == IRQ_NONE
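A minimal sketch of that accounting step, again as a user-space model and not the kernel code itself. The struct and function names are hypothetical, HZ is fixed at 100 purely as an assumption for the example, and the kernel's wraparound-safe time_after comparison is replaced by a plain comparison.

```c
#include <assert.h>
#include <stddef.h>

#define HZ 100  /* assumed tick rate for this sketch; really a kernel config choice */

enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 };

/* Hypothetical subset of struct irq_desc used by the accounting. */
struct unhandled_state {
    unsigned int irqs_unhandled;
    unsigned long last_unhandled;  /* jiffies at the last unhandled IRQ */
};

static void account_unhandled(struct unhandled_state *s,
                              unsigned int action_ret, unsigned long jiffies)
{
    assert(s != NULL);
    if (action_ret != IRQ_NONE)
        return;                    /* only IRQ_NONE counts as unhandled */

    /*
     * An isolated spurious interrupt (more than HZ/10 ticks after the
     * previous one) restarts the count; a rapid burst keeps adding up.
     */
    if (jiffies > s->last_unhandled + HZ / 10)
        s->irqs_unhandled = 1;
    else
        s->irqs_unhandled++;
    s->last_unhandled = jiffies;
}
```

The design point worth noticing is that only a dense stream of IRQ_NONE results can push the counter toward 99900; occasional stray interrupts keep resetting it to 1.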

Looking back at handle_irq_event_percpu, action_ret is simply retval.

 


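In __handle_irq_event_percpu, res (the value returned by each action->handler) is ORed into retval, so res is what ends up as action_ret.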
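The relationship between res, retval and action_ret can be sketched like this. It is a simplified user-space model: the struct, the two sample handlers and the last_noted variable are hypothetical names for illustration, and the two kernel functions are folded into one for brevity.

```c
#include <assert.h>
#include <stddef.h>

enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 };

typedef unsigned int (*irq_handler_t)(int irq, void *dev_id);

/* Hypothetical cut-down struct irqaction: one node per registered handler. */
struct irqaction_model {
    irq_handler_t handler;
    void *dev_id;
    struct irqaction_model *next;
};

static int noirqdebug;                /* 0 by default, as on the failing system */
static unsigned int last_noted = ~0u; /* hypothetical: last value note_interrupt saw */

static void note_interrupt_model(unsigned int action_ret)
{
    last_noted = action_ret;
}

static unsigned int handle_irq_event_percpu_model(int irq,
                                                  struct irqaction_model *action)
{
    unsigned int retval = IRQ_NONE;

    /* Walk every handler sharing the line and OR the results together. */
    for (; action; action = action->next) {
        assert(action->handler != NULL);
        unsigned int res = action->handler(irq, action->dev_id);
        retval |= res;
    }

    /* The combined result is what note_interrupt receives as action_ret. */
    if (!noirqdebug)
        note_interrupt_model(retval);
    return retval;
}

/* Two sample handlers, hypothetical and for demonstration only. */
static unsigned int returns_none(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_NONE;
}

static unsigned int returns_handled(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_HANDLED;
}
```

With a chain containing returns_none and returns_handled, the combined value is IRQ_HANDLED; with only returns_none, note_interrupt_model receives IRQ_NONE.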

The action->handler callback is the second argument passed to request_threaded_irq when the threaded interrupt is registered.

kernel/irq/manage.c

Because the driver passes handler as NULL, the kernel sets handler = irq_default_primary_handler.
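The substitution can be sketched in user space as below. This is a model under stated assumptions and not the real request_threaded_irq: the function takes a reduced, hypothetical parameter list, irqaction_model is a made-up struct, and touch_thread_fn merely stands in for the driver's thread function (synaptics_rmi4_irq in the log).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 };

typedef unsigned int (*irq_handler_t)(int irq, void *dev_id);

/* Hypothetical cut-down struct irqaction. */
struct irqaction_model {
    irq_handler_t handler;    /* primary handler, runs in hard-IRQ context */
    irq_handler_t thread_fn;  /* threaded handler, runs in a kernel thread */
};

/* The default primary handler does nothing but ask for the thread to run. */
static unsigned int irq_default_primary_handler(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_WAKE_THREAD;
}

/* Model of the NULL-handler check performed at registration time. */
static int request_threaded_irq_model(struct irqaction_model *action,
                                      irq_handler_t handler,
                                      irq_handler_t thread_fn)
{
    assert(action != NULL);
    if (!handler) {
        if (!thread_fn)
            return -EINVAL;   /* neither handler given: reject */
        handler = irq_default_primary_handler;
    }
    action->handler = handler;
    action->thread_fn = thread_fn;
    return 0;
}

/* Hypothetical stand-in for the touchscreen driver's thread function. */
static unsigned int touch_thread_fn(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_HANDLED;
}
```

Registering with a NULL primary handler and touch_thread_fn leaves action->handler pointing at irq_default_primary_handler, which matches the "handlers:" line in the log.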

 


So the primary handler returns IRQ_WAKE_THREAD, that is, action_ret = IRQ_WAKE_THREAD.

kernel/irq/spurious.c