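This article traces a system bug in WKWebView on iOS 16.0..<16.2 by reverse engineering the system, reading assembly instructions, and working step by step back to the source code. Apple has already fixed the bug in newer versions, but for the huge base of existing users it still caused 1200+ crash PV and 1000+ crash UV per day. In the end we worked around the bug by hooking the system behavior. The fix shipped in the Double 11 release of Mobile Taobao, and the crash count dropped to 0.
Background
Mobile Taobao's crash rate (Crash + Abort) has stayed at around x% for a year or two. This year the organization set a higher bar and asked us to push crashes down further. I joined the effort and took on several hard cases, and I was fortunate enough to locate and resolve a few operating system bugs. This article records the investigation of, and the solution to, the system bug that crashes in VisionKitCore.
Crash information
Stack signature:
Notable Address signature:
Extra information: (all observed cases were on image-and-text detail pages)
PS: the screenshots carry watermarks and cannot be shown here. The extra information is the current-page information attached by our modified KSCrash.
Version distribution:
Crash share: third among crashes that have a stack
The brief information above is already enough to suggest that this is most likely an operating system bug. Because Nianji had earlier cleaned up many crashes with business stacks, this hard case had climbed into the Top 3 of crashes (with a stack), so it had to be addressed.
Investigation
I first searched the Apple Developer Forums for this crash stack and found that someone had indeed reported it.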
https://developer.apple.com/forums/thread/718305
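Last year someone reported on the Apple forums that the bug is triggered by the long-press-to-copy-image logic in the web view, and one user replied that disabling the WKWebView long-press gesture avoids the crash (it actually does not). Based on this information I ran some tests, and I picked an image-and-text detail page visited by an affected user from our platform to try to reproduce the stack.
WKWebView *webview = [[WKWebView alloc] initWithFrame:self.view.bounds];
[self.view addSubview:webview];
[webview loadRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"https://url"]]];
The forum user claimed that disabling long press prevents the crash. In my tests, however, disabling long press only stops WKWebView from creating the selection box; the image-creation logic still runs. Mobile Taobao's WebView container has also disabled the default long-press selection box and only implements a save-image feature, so the workaround from that thread cannot fix the bug in Mobile Taobao.
I happened to study Arm64 assembly systematically this year, so this was a good chance to exercise the new knowledge and hunt for the bug at a low level. The abbreviated crash report: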
Incident Identifier: 9DAC8C95-D65D-4AA2-BF12-D36DC1A7F3B8CrashReporter Key: KSCrash2Hardware Model: iPhone14,2Process: Taobao4iPhone [20565]Path: /private/var/containers/Bundle/Application/36FBCF28-38AA-40B3-8234-EDAE1B3D6611/Taobao4iPhone.app/Taobao4iPhoneIdentifier: TBXDetailViewController|com.taobao.taobao4iphoneVersion: 31863389 (10.27.40)Code Type: ARM-64Parent Process: ? [1]Date/Time: 2023-09-11 21:32:18 +0800Launch Time: 2023-09-11 21:27:01 +0800OS Version: iOS 16.1.1 (20B101)Report Version: 104Exception Type: EXC_BAD_ACCESSException Codes: KERN_INVALID_ADDRESS at 0x0000000173fc8000Exception Subtype: SIGSEGVTriggered by Thread: 99Thread 99 Crashed:0 libsystem_platform.dylib 0x000000021350e930 0x21350e000 + 2352 _platform_memmove :96 (in libsystem_platform.dylib)1 CoreGraphics 0x00000001c8159988 0x1c80f3000 + 420232 _CGDataProviderCreateWithCopyOfData :20 (in CoreGraphics)2 CoreGraphics 0x00000001c8142648 0x1c80f3000 + 325192 _CGBitmapContextCreateImage :216 (in CoreGraphics)3 VisionKitCore 0x0000000208405ad0 0x2083fa000 + 47824 -[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:] :348 (in VisionKitCore)4 VisionKitCore 0x0000000208405880 0x2083fa000 + 47232 -[VKCRemoveBackgroundResult createCGImage] :156 (in VisionKitCore)5 VisionKitCore 0x000000020849da98 0x2083fa000 + 670360 __vk_cgImageRemoveBackgroundWithDownsizing_block_invoke :64 (in VisionKitCore)6 VisionKitCore 0x0000000208473a5c 0x2083fa000 + 498268 __63-[VKCRemoveBackgroundRequestHandler performRequest:completion:]_block_invoke.5 :436 (in VisionKitCore)7 MediaAnalysisServices 0x0000000209847968 0x209840000 + 31080 __92-[MADService performRequests:onPixelBuffer:withOrientation:andIdentifier:completionHandler:]_block_invoke.38 :400 (in MediaAnalysisServices)8 CoreFoundation 0x00000001c65b8704 0x1c6544000 + 476932 __invoking___ :148 (in CoreFoundation)9 CoreFoundation 0x00000001c6564b6c 0x1c6544000 + 133996 -[NSInvocation invoke] :428 (in CoreFoundation)10 Foundation 
0x00000001c09c5b08 0x1c0924000 + 662280 __NSXPCCONNECTION_IS_CALLING_OUT_TO_REPLY_BLOCK__ :16 (in Foundation)11 Foundation 0x00000001c0996ef0 0x1c0924000 + 470768 -[NSXPCConnection _decodeAndInvokeReplyBlockWithEvent:sequence:replyInfo:] :520 (in Foundation)12 Foundation 0x00000001c0f702e4 0x1c0924000 + 6603492 __88-[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:]_block_invoke_5 :188 (in Foundation)13 libxpc.dylib 0x0000000213604f1c 0x2135e7000 + 122652 _xpc_connection_reply_callout :124 (in libxpc.dylib)14 libxpc.dylib 0x00000002135f7fb4 0x2135e7000 + 69556 _xpc_connection_call_reply_async :88 (in libxpc.dylib)15 libdispatch.dylib 0x00000001cdb1e05c 0x1cdb1a000 + 16476 _dispatch_client_callout3 :20 (in libdispatch.dylib)16 libdispatch.dylib 0x00000001cdb3bf58 0x1cdb1a000 + 139096 _dispatch_mach_msg_async_reply_invoke :344 (in libdispatch.dylib)17 libdispatch.dylib 0x00000001cdb2556c 0x1cdb1a000 + 46444 _dispatch_lane_serial_drain :376 (in libdispatch.dylib)18 libdispatch.dylib 0x00000001cdb26214 0x1cdb1a000 + 49684 _dispatch_lane_invoke :436 (in libdispatch.dylib)19 libdispatch.dylib 0x00000001cdb30e10 0x1cdb1a000 + 93712 _dispatch_workloop_worker_thread :652 (in libdispatch.dylib)20 libsystem_pthread.dylib 0x00000002135a3df8 0x2135a3000 + 3576 _pthread_wqthread :288 (in libsystem_pthread.dylib)21 libsystem_pthread.dylib 0x00000002135a3b98 0x2135a3000 + 2968 _start_wqthread :8 (in libsystem_pthread.dylib)Thread State: x8:0xe361afd13768009c x9:0xe361afd13768009c lr:0x00000001c8155de0 fp:0x000000016bcae040 x10:0x0000000000000090 x12:0x0000000115204f70 x11:0x000000021cbc2268 x14:0x000000021dfff180 x13:0x000000021dfff160 x16:0x000000021350e8d0 x15:0x00000000e781489a sp:0x000000016bcadfb0 x18:0x0000000000000000 x17:0x000000021e003320 x19:0x0000000000143c00 cpsr:0x0000000020001000 pc:0x000000021350e930 x21:0x0000000000148000 x20:0x0000000173e846c8 x0:0x000000016fe8c6c8 x23:0x000000016fe8c000 x1:0x0000000173fc8000 
x22:0x000000016fe8c6c8 x2:0x00000000000002a8 x25:0x0000000000000020 x3:0x000000016ffd0000 x24:0x000000021cbae570 x4:0x0000000003ff8000 x27:0x0000000000000000 x5:0x0000000000000018 x26:0x0000000000000008 x6:0x000000000000002c x7:0x0000000000000000 x28:0x000000028040e180Binary Images:0x0000000104638000 - 0x000000010c70bfff Taobao4iPhone arm64 <23be6181e1c43ce9a6b37d61de01bab3> /private/var/containers/Bundle/Application/36FBCF28-38AA-40B3-8234-EDAE1B3D6611/Taobao4iPhone.app/Taobao4iPhone0x000000021350e000 - 0x0000000213514ff3 libsystem_platform.dylib arm64e <29a26364acef38c28b0ddb0dfca0bb65> /usr/lib/system/libsystem_platform.dylib0x00000001c80f3000 - 0x00000001c8700ff3 CoreGraphics arm64e <ffb3f1e74e3b3ff79d00be32c9d8133c> /System/Library/Frameworks/CoreGraphics.framework/CoreGraphics0x00000002083fa000 - 0x0000000208500fff VisionKitCore arm64e <ce997b5ba4b03818bba22d7f057bc3a2> /System/Library/PrivateFrameworks/VisionKitCore.framework/VisionKitCore0x0000000209840000 - 0x000000020985dfff MediaAnalysisServices arm64e <0c75ee56f3343b8ca96080651906e0dd> /System/Library/PrivateFrameworks/MediaAnalysisServices.framework/MediaAnalysisServices0x00000001c6544000 - 0x00000001c6929fff CoreFoundation arm64e <5cdc5d9ae5063740b64ebb30867b4f1b> /System/Library/Frameworks/CoreFoundation.framework/CoreFoundation0x00000001c0924000 - 0x00000001c126dfff Foundation arm64e <c431acb6fe043d28b6774de6e1c7d81f> /System/Library/Frameworks/Foundation.framework/FoundationNotable Addresses:memory near x0: 0x000000016fe8c678: 0000000000000000 0000000000000000 ................ 0x000000016fe8c688: 0000000000000000 0000000000000000 ................ 0x000000016fe8c698: 0000000000000000 0000000000000000 ................ 0x000000016fe8c6a8: 0000000000000000 0000000000000000 ................ 0x000000016fe8c6b8: 0000000000000000 0000000000000000 ................ ->0x000000016fe8c6c8: 878787a3aaaaaac5 a9a9a9c5b5b5b5d2 ................ 0x000000016fe8c6d8: cbcbcbe7d5d5d5f0 d6d6d6f1d4d4d4f1 ................ 
0x000000016fe8c6e8: d4d4d4f1d6d6d6f4 d8d8d8f7d8d8d8f8 ................ [0xf8d8d8d8f7d8d8d8: [objc_object: NSString()]] 0x000000016fe8c6f8: d9d9d9fadadadafb dbdbdbfbdcdcdcfc ................ 0x000000016fe8c708: dededefddfdfdffe dfdfdffedfdfdffe ................ 0x000000016fe8c718: dfdfdffedfdfdffe dfdfdfffe0e0e0ff ................ 0x000000016fe8c728: e0e0e0ffe0e0e0ff e0e0e0ffdfdfdfff ................ [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] [0xffdfdfdfffe0e0e0: [objc_object: NSString(Hh-e2cRJ)]] 0x000000016fe8c738: e0e0e0ffe0e0e0ff e0e0e0ffe0e0e0ff ................ [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] 0x000000016fe8c748: e0e0e0ffe0e0e0ff e0e0e0ffe0e0e0ff ................ [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] 0x000000016fe8c758: e0e0e0ffe0e0e0ff e0e0e0ffe0e0e0ff ................ [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] [0xffe0e0e0ffe0e0e0: [objc_object: NSString(NATY2cRJ)]] memory near x1: 0x0000000173fc7fb0: 0000000000000000 0000000000000000 ................ 0x0000000173fc7fc0: 0000000000000000 0000000000000000 ................ 0x0000000173fc7fd0: 0000000000000000 0000000000000000 ................ 0x0000000173fc7fe0: 0000000000000000 0000000000000000 ................ 0x0000000173fc7ff0: 0000000000000000 0000000000000000 ................ memory near x3: 0x000000016ffcffb0: 0000000000000000 0000000000000000 ................ 0x000000016ffcffc0: 0000000000000000 0000000000000000 ................ 0x000000016ffcffd0: 0000000000000000 0000000000000000 ................ 0x000000016ffcffe0: 0000000000000000 0000000000000000 ................ 0x000000016ffcfff0: 0000000000000000 0000000000000000 ................ ->0x000000016ffd0000: 0000000000000000 0000000000000000 ................ 0x000000016ffd0010: 0000000000000000 0000000000000000 ................ 
0x000000016ffd0020: 0000000000000000 0000000000000000 ................ 0x000000016ffd0030: 0000000000000000 0000000000000000 ................ 0x000000016ffd0040: 0000000000000000 0000000000000000 ................ 0x000000016ffd0050: 0000000000000000 0000000000000000 ................ 0x000000016ffd0060: 0000000000000000 0000000000000000 ................ 0x000000016ffd0070: 0000000000000000 0000000000000000 ................ 0x000000016ffd0080: 0000000000000000 0000000000000000 ................ 0x000000016ffd0090: 0000000000000000 0000000000000000 ................
Image-and-text detail page link: https://xxxx.xx.com

Analyzing the assembly of the key functions

The function call stack is:
0 libsystem_platform.dylib 0x00000001fb27a930 _platform_memmove :96 (in libsystem_platform.dylib)
1 CoreGraphics 0x00000001afec1988 _CGDataProviderCreateWithCopyOfData :20 (in CoreGraphics)
2 CoreGraphics 0x00000001afeaa648 _CGBitmapContextCreateImage :216 (in CoreGraphics)
3 VisionKitCore 0x00000001f0171ad0 -[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:] :348 (in VisionKitCore)
4 VisionKitCore 0x00000001f0171880 -[VKCRemoveBackgroundResult createCGImage] :156 (in VisionKitCore)
5 VisionKitCore 0x00000001f0209a98 __vk_cgImageRemoveBackgroundWithDownsizing_block_invoke :64 (in VisionKitCore)
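  1. Background knowledge: the Arm64 calling convention and parameter-passing rules (link: https://developer.arm.com/documentation/den0024/a/The-ABI-for-ARM-64-bit-Architecture/Register-use-in-the-AArch64-Procedure-Call-Standard/Parameters-in-general-purpose-registers)
For this article, it is enough to know the following:
    1. x0..x7 are the general-purpose registers used to pass arguments in a function call; they carry the 1st through 8th scalar arguments.
    2. v0..v7 are 128-bit floating-point registers; d0..d7 are their low 8 bytes and carry the 1st through 8th floating-point arguments.
    3. x29 is the fp register and points to the base of the current stack frame.
    4. x30 is the lr register and holds the return address of the function call.
  2. Symbolication must use exactly the same OS version as the one where the problem occurred. Fortunately, Wanyu had a phone running iOS 16.1.1.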
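To make the register rules above concrete, here is a minimal sketch in C that assigns argument registers under a deliberately simplified model of the convention. The function name assign_registers and the 'i'/'f' encoding are hypothetical, introduced only for illustration; real AAPCS64 has more rules (aggregates, alignment, variadics) that this sketch ignores.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified model of AArch64 argument passing (illustration only):
 *   'i' = integer or pointer argument -> next free register of x0..x7
 *   'f' = floating-point argument     -> next free register of d0..d7
 * Once a register class is exhausted, further arguments of that class
 * are reported as "stack". Writes e.g. "x0 x1 d0" into out. */
static void assign_registers(const char *kinds, char *out, size_t out_size)
{
    assert(kinds != NULL && out != NULL && out_size > 0);
    int next_int = 0, next_fp = 0;
    size_t used = 0;
    out[0] = '\0';
    for (const char *k = kinds; *k != '\0'; ++k) {
        assert(*k == 'i' || *k == 'f');
        char name[8];
        if (*k == 'i' && next_int < 8)
            snprintf(name, sizeof name, "x%d", next_int++);
        else if (*k == 'f' && next_fp < 8)
            snprintf(name, sizeof name, "d%d", next_fp++);
        else
            snprintf(name, sizeof name, "stack");
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? " " : "", name);
        assert(n > 0 && (size_t)n < out_size - used); /* no truncation */
        used += (size_t)n;
    }
}
```

For an Objective-C method such as -[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:], the arguments are self, the selector, the pixel buffer, and the four doubles of the CGRect, so calling assign_registers("iiiffff", ...) yields x0 x1 x2 d0 d1 d2 d3. That matches the register annotations in the disassembly analyzed later.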
libsystem_platform: analysis of _platform_memmove
__platform_memmove: (x0: dest, x1: src, x2: count)
00000001d3d628d0 sub x3, x0, x1 ; x3 = x0 - x1
00000001d3d628d4 cmp x3, x2 ; x3 < x2?
00000001d3d628d8 b.lo 0x1d3d62aa0 ; appears to check whether the tail of src overlaps dest; not taken in this case
00000001d3d628dc mov x3, x0 ; x3 = x0
00000001d3d628e0 cmp x2, #0x40 ; x2 - 0x40?
00000001d3d628e4 b.lo 0x1d3d62a7c ; checks whether count is below 0x40; not taken in this case
00000001d3d628e8 sub x4, x1, x0 ; x4 = x1 - x0
00000001d3d628ec cmp x4, x2 ; x4 - x2; appears to check whether the tail of dest overlaps src
00000001d3d628f0 b.lo 0x1d3d629b4 ; not taken either
00000001d3d628f4 cmp x2, #0x4, lsl #12 ; compares count against 0x4000
00000001d3d628f8 b.lo 0x1d3d62958 ; not below in this case either
00000001d3d628fc add x3, x3, #0x20
00000001d3d62900 and x3, x3, #0xffffffffffffffe0
00000001d3d62904 ldnp q2, q3, [x1]
00000001d3d62908 sub x5, x3, x0
00000001d3d6290c add x1, x1, x5
00000001d3d62910 ldnp q0, q1, [x1]
00000001d3d62914 add x1, x1, #0x20
00000001d3d62918 sub x2, x2, x5
00000001d3d6291c stnp q2, q3, [x0]
00000001d3d62920 subs x2, x2, #0x40
00000001d3d62924 b.ls 0x1d3d62940
00000001d3d62928 stnp q0, q1, [x3]
00000001d3d6292c add x3, x3, #0x20
00000001d3d62930 ldnp q0, q1, [x1] ; crash site (+96 in the stack trace); x1 is 0x0000000173fc8000 here, and the memory dumped near it is all zeros
The analysis shows that __platform_memmove is a fairly ordinary memmove/memcpy implementation with some head and tail overlap checks. At the moment of the crash, the address in x1 pointed at a block of data, and something was wrong with that data. Let's keep going up the stack.
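The overlap check at the top of the routine (sub, cmp, b.lo) relies on a single unsigned comparison. The following is a minimal sketch of that idea in C, not Apple's implementation: the names must_copy_backward and sketch_memmove are hypothetical, and the real routine copies in wide vector chunks rather than byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors `sub x3, x0, x1 ; cmp x3, x2 ; b.lo`: with unsigned arithmetic,
 * (dst - src) < n is true exactly when dst lies inside [src, src + n).
 * In that case a forward copy would overwrite source bytes before they
 * are read, so the copy has to run from the end towards the start. */
static int must_copy_backward(const void *dst, const void *src, size_t n)
{
    return (uintptr_t)dst - (uintptr_t)src < n;
}

/* Byte-wise memmove built on that check (illustration only). */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    if (must_copy_backward(dst, src, n)) {
        while (n--)            /* copies index n-1 down to 0 */
            d[n] = s[n];
    } else {
        for (size_t i = 0; i < n; ++i)
            d[i] = s[i];
    }
    return dst;
}
```

If dst is below src, the unsigned subtraction wraps around to a very large value, so the comparison fails and the forward path is taken, which is safe for that direction of overlap.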
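  • _CGDataProviderCreateWithCopyOfData
Here it turns out that the address _CGDataProviderCreateWithCopyOfData jumps to is _create_protected_copy, (⊙o⊙)…
Strangely, this function does not appear in the crash stack at all. On top of that, _create_protected_copy contains no b, br, or bl to _platform_memmove. Is something wrong with this stack?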
CoreGraphics`create_protected_copy:-> 0x19c8a1cd0 <+0>: pacibsp 0x19c8a1cd4 <+4>: sub sp, sp, #0xa0 0x19c8a1cd8 <+8>: stp x24, x23, [sp, #0x60] 0x19c8a1cdc <+12>: stp x22, x21, [sp, #0x70] 0x19c8a1ce0 <+16>: stp x20, x19, [sp, #0x80] 0x19c8a1ce4 <+20>: stp x29, x30, [sp, #0x90] 0x19c8a1ce8 <+24>: add x29, sp, #0x90 0x19c8a1cec <+28>: mov x21, #0x0 0x19c8a1cf0 <+32>: cbz x0, 0x19c8a1e5c ; <+396> 0x19c8a1cf4 <+36>: mov x19, x1 0x19c8a1cf8 <+40>: cbz x1, 0x19c8a1e5c ; <+396> 0x19c8a1cfc <+44>: mov x20, x0 0x19c8a1d00 <+48>: adrp x8, 341894 0x19c8a1d04 <+52>: ldr x8, [x8, #0xaa0] 0x19c8a1d08 <+56>: ldr x8, [x8] 0x19c8a1d0c <+60>: cmp x8, x19 0x19c8a1d10 <+64>: b.ls 0x19c8a1d48 ; <+120> 0x19c8a1d14 <+68>: mov x0, #0x0 0x19c8a1d18 <+72>: mov x1, x20 0x19c8a1d1c <+76>: mov x2, x19 0x19c8a1d20 <+80>: ldp x29, x30, [sp, #0x90] 0x19c8a1d24 <+84>: ldp x20, x19, [sp, #0x80] 0x19c8a1d28 <+88>: ldp x22, x21, [sp, #0x70] 0x19c8a1d2c <+92>: ldp x24, x23, [sp, #0x60] 0x19c8a1d30 <+96>: add sp, sp, #0xa0 0x19c8a1d34 <+100>: autibsp 0x19c8a1d38 <+104>: eor x16, x30, x30, lsl #1 0x19c8a1d3c <+108>: tbz x16, #0x3e, 0x19c8a1d44 ; <+116> 0x19c8a1d40 <+112>: brk #0xc471 0x19c8a1d44 <+116>: b 0x1a076cd80 0x19c8a1d48 <+120>: neg x9, x8 0x19c8a1d4c <+124>: and x22, x9, x20 0x19c8a1d50 <+128>: add x10, x19, x20 0x19c8a1d54 <+132>: add x8, x10, x8 0x19c8a1d58 <+136>: sub x8, x8, #0x1 0x19c8a1d5c <+140>: and x8, x8, x9 0x19c8a1d60 <+144>: sub x21, x8, x22 0x19c8a1d64 <+148>: mov x0, #0x0 0x19c8a1d68 <+152>: mov x1, x21 0x19c8a1d6c <+156>: mov w2, #0x3 0x19c8a1d70 <+160>: mov w3, #0x1002 0x19c8a1d74 <+164>: mov w4, #0x36000000 0x19c8a1d78 <+168>: mov x5, #0x0 0x19c8a1d7c <+172>: bl 0x1a076e5c0 0x19c8a1d80 <+176>: cmn x0, #0x1 0x19c8a1d84 <+180>: b.eq 0x19c8a1e58 ; <+392> 0x19c8a1d88 <+184>: mov x23, x0 0x19c8a1d8c <+188>: adrp x24, 341894 0x19c8a1d90 <+192>: ldr x24, [x24, #0xa80] 0x19c8a1d94 <+196>: ldr w0, [x24] 0x19c8a1d98 <+200>: mov x1, x22 0x19c8a1d9c <+204>: mov x2, x21 0x19c8a1da0 
<+208>: mov x3, x23 0x19c8a1da4 <+212>: bl 0x1a076ecd0 0x19c8a1da8 <+216>: sub x8, x20, x22 0x19c8a1dac <+220>: add x22, x8, x23 0x19c8a1db0 <+224>: cbz w0, 0x19c8a1de0 ; <+272> 0x19c8a1db4 <+228>: adrp x8, 1405 0x19c8a1db8 <+232>: add x8, x8, #0xed9 ; "copy_read_only" 0x19c8a1dbc <+236>: stp x8, x0, [sp] 0x19c8a1dc0 <+240>: adrp x1, 1405 0x19c8a1dc4 <+244>: add x1, x1, #0xeba ; "%s: vm_copy failed: status %d." 0x19c8a1dc8 <+248>: mov w0, #0x0 0x19c8a1dcc <+252>: bl 0x19cb1ffcc ; CGLog 0x19c8a1dd0 <+256>: mov x0, x22 0x19c8a1dd4 <+260>: mov x1, x20 0x19c8a1dd8 <+264>: mov x2, x19 0x19c8a1ddc <+268>: bl 0x19cc48f80 ; symbol stub for: memcpy 0x19c8a1de0 <+272>: ldr w0, [x24] 0x19c8a1de4 <+276>: mov x1, x22 0x19c8a1de8 <+280>: mov x2, x19 0x19c8a1dec <+284>: mov w3, #0x1 0x19c8a1df0 <+288>: mov w4, #0x1 0x19c8a1df4 <+292>: bl 0x1a076ecf0 0x19c8a1df8 <+296>: cbz x22, 0x19c8a1e58 ; <+392> 0x19c8a1dfc <+300>: cmp x22, x20 0x19c8a1e00 <+304>: b.eq 0x19c8a1d14 ; <+68> 0x19c8a1e04 <+308>: movi.2d v0, #0000000000000000 0x19c8a1e08 <+312>: stp q0, q0, [sp, #0x30] 0x19c8a1e0c <+316>: stp q0, q0, [sp, #0x10] 0x19c8a1e10 <+320>: str x21, [sp, #0x18] 0x19c8a1e14 <+324>: adrp x16, -67 0x19c8a1e18 <+328>: add x16, x16, #0x544 ; vm_allocator_deallocate 0x19c8a1e1c <+332>: paciza x16 0x19c8a1e20 <+336>: stp x16, xzr, [sp, #0x48] 0x19c8a1e24 <+340>: add x1, sp, #0x10 0x19c8a1e28 <+344>: mov x0, #0x0 0x19c8a1e2c <+348>: bl 0x1a076cb00 0x19c8a1e30 <+352>: mov x20, x0 0x19c8a1e34 <+356>: mov x0, #0x0 0x19c8a1e38 <+360>: mov x1, x22 0x19c8a1e3c <+364>: mov x2, x19 0x19c8a1e40 <+368>: mov x3, x20 0x19c8a1e44 <+372>: bl 0x1a076cdb0 0x19c8a1e48 <+376>: mov x21, x0 0x19c8a1e4c <+380>: mov x0, x20 0x19c8a1e50 <+384>: bl 0x1a076d200 0x19c8a1e54 <+388>: b 0x19c8a1e5c ; <+396> 0x19c8a1e58 <+392>: mov x21, #0x0 0x19c8a1e5c <+396>: mov x0, x21 0x19c8a1e60 <+400>: ldp x29, x30, [sp, #0x90] 0x19c8a1e64 <+404>: ldp x20, x19, [sp, #0x80] 0x19c8a1e68 <+408>: ldp x22, x21, [sp, #0x70] 0x19c8a1e6c <+412>: 
ldp x24, x23, [sp, #0x60] 0x19c8a1e70 <+416>: add sp, sp, #0xa0 0x19c8a1e74 <+420>: retab
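  • Parsing an Arm64 crash stack
After consulting a senior colleague, I filled in a blind spot in my knowledge. The x86_64 calling convention requires a call to push the address of the instruction following the call (the return address) onto the stack, so walking the stack is enough to recover the correct call stack.
The Arm64 architecture, however, keeps the return address in the LR register. If the current function needs to call another function, it has to save the value of lr in its prologue. This is the boilerplate that often appears at the start of a function:
WKCopy`+[ViewController load]:
0x100b0c000 <+0>: sub sp, sp, #0x20 // grow the stack
0x100b0c004 <+4>: stp x29, x30, [sp, #0x10] // save the caller's fp and lr on the stack
0x100b0c008 <+8>: add x29, sp, #0x10 // fp points to the new frame base
... do something ...
0x100b0c020 <+32>: ldp x29, x30, [sp, #0x10] // restore the caller's fp and lr
0x100b0c024 <+36>: add sp, sp, #0x20 // shrink the stack
0x100b0c028 <+40>: ret // return to the caller
However, not every function uses the stack; such functions are called frameless functions. Functions such as memset, memmove, and memcpy typically load a chunk of data from the source address into registers, store it from the registers to the destination address, and advance until the requested length has been copied.
Arm64 also has branch instructions that do not return, such as b and br, which are typically used in stubs.
In tail-call situations, where a function has no reason to come back after calling the next function, the compiler also uses a plain b instruction to skip the unnecessary return. The most common example is objc_msgSend, which both uses tail-call optimization and is a frameless function.
When the process crashes, KSCrash unwinds the call stack. When a frameless function is involved, there are some details to be aware of. Specifically:
1. For the crashing function, use the pc address directly to obtain the last stack frame and its starting range.
2. Walk to the previous frame: load fp and lr from the frame record that x29 points to (ldp fp, lr, [x29]) and use lr to work out the calling function.
3. Repeat step 2. When lr reaches 0, we have arrived at the thread start function and the walk ends.
See KSStackCursor for the code.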
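The three steps above can be sketched in C as a walk over a chain of frame records. This is a simplified model and not KSCrash's actual code: the frame_record type and walk_stack function are hypothetical names, and a real unwinder must also validate that each fp is readable memory before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An AArch64 frame record: the pair written by `stp x29, x30, [...]`. */
typedef struct frame_record {
    const struct frame_record *prev_fp; /* caller's x29 */
    uintptr_t saved_lr;                 /* return address into the caller */
} frame_record;

/* Step 1: the first entry is the crashing pc itself.
 * Steps 2-3: follow the fp chain, recording each saved lr, and stop when
 * the saved lr is 0 (thread start) or the chain ends.
 * Returns the number of addresses written to out. */
static size_t walk_stack(uintptr_t pc, const frame_record *fp,
                         uintptr_t *out, size_t max_frames)
{
    assert(out != NULL);
    size_t n = 0;
    if (n < max_frames)
        out[n++] = pc;
    while (fp != NULL && n < max_frames) {
        if (fp->saved_lr == 0)
            break;
        out[n++] = fp->saved_lr;
        fp = fp->prev_fp;
    }
    return n;
}
```

Note that this walk reads only frame records in memory and never the live lr register. If the crashing function is frameless, fp still points at its caller's frame record, so the caller's own return site is visible only in lr and is skipped by a walk like this one.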
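But there is one scenario, frameless function + b + frameless function crash, in which frames appear to be missing from the stack. In this case two frames were lost, for the following reasons:
1. _memcpy is tail-call optimized, so the tail-calling function itself disappears from the stack. This is expected.
2. _platform_memmove is a frameless function, so it has no logic for saving a frame. The lr read from the stack is actually the one saved in _create_protected_copy's frame; because _platform_memmove has no frame of its own, its own lr is lost from the walk. For functions like this, you can find the calling function from the address in the lr register.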
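Looking up the function from lr starts with finding which binary image the address belongs to and its offset inside that image. A minimal sketch in C follows, using the image ranges, pc, and lr values from the crash report above; the binary_image type and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint64_t start; /* load address of the image */
    uint64_t end;   /* last address of the image, inclusive */
} binary_image;

/* Ranges copied from the "Binary Images" section of the crash report. */
static const binary_image kImages[] = {
    { "libsystem_platform.dylib", 0x000000021350e000ULL, 0x0000000213514ff3ULL },
    { "CoreGraphics",             0x00000001c80f3000ULL, 0x00000001c8700ff3ULL },
    { "VisionKitCore",            0x00000002083fa000ULL, 0x0000000208500fffULL },
};

/* Returns the image containing addr, or NULL if none matches. */
static const binary_image *find_image(uint64_t addr)
{
    for (size_t i = 0; i < sizeof kImages / sizeof kImages[0]; ++i)
        if (addr >= kImages[i].start && addr <= kImages[i].end)
            return &kImages[i];
    return NULL;
}

/* Offset of addr from the image load address; this is the value a
 * symbolication tool needs together with the matching OS build. */
static uint64_t image_offset(const binary_image *img, uint64_t addr)
{
    assert(img != NULL && addr >= img->start && addr <= img->end);
    return addr - img->start;
}
```

With the values from the report, pc 0x000000021350e930 falls in libsystem_platform.dylib at offset 2352, which agrees with frame 0, and lr 0x00000001c8155de0 falls in CoreGraphics at offset 0x62de0. Symbolicating that offset against the same OS build is how the missing caller can be identified.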
So the real call stack in this case is:
0 libsystem_platform.dylib 0x00000001fb27a930 _platform_memmove :96 (in libsystem_platform.dylib)
missing frame 2: _memcpy
missing frame 1: _create_protected_copy
1 CoreGraphics 0x00000001afec1988 _CGDataProviderCreateWithCopyOfData :20 (in CoreGraphics)
2 CoreGraphics 0x00000001afeaa648 _CGBitmapContextCreateImage :216 (in CoreGraphics)
3 VisionKitCore 0x00000001f0171ad0 -[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:] :348 (in VisionKitCore)
4 VisionKitCore 0x00000001f0171880 -[VKCRemoveBackgroundResult createCGImage] :156 (in VisionKitCore)
5 VisionKitCore 0x00000001f0209a98 __vk_cgImageRemoveBackgroundWithDownsizing_block_invoke :64 (in VisionKitCore)
Moving on:
VisionKitCore`-[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]:
VisionKitCore`-[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]: 0x1dcb51974 <+0>: cbz x0, 0x1dcb51b98 ; <+548> 0x1dcb51978 <+4>: pacibsp 0x1dcb5197c <+8>: sub sp, sp, #0x90 0x1dcb51980 <+12>: stp d11, d10, [sp, #0x10] 0x1dcb51984 <+16>: stp d9, d8, [sp, #0x20] 0x1dcb51988 <+20>: stp x28, x27, [sp, #0x30] 0x1dcb5198c <+24>: stp x26, x25, [sp, #0x40] 0x1dcb51990 <+28>: stp x24, x23, [sp, #0x50] 0x1dcb51994 <+32>: stp x22, x21, [sp, #0x60] 0x1dcb51998 <+36>: stp x20, x19, [sp, #0x70] 0x1dcb5199c <+40>: stp x29, x30, [sp, #0x80] 0x1dcb519a0 <+44>: add x29, sp, #0x80 0x1dcb519a4 <+48>: fmov d8, d3 0x1dcb519a8 <+52>: fmov d9, d2 0x1dcb519ac <+56>: fmov d10, d1 0x1dcb519b0 <+60>: fmov d11, d0 0x1dcb519b4 <+64>: mov x19, x2 0x1dcb519b8 <+68>: mov x0, x2 0x1dcb519bc <+72>: mov w1, #0x1 0x1dcb519c0 <+76>: bl 0x1dda62ce0 CVPixelBufferLockBaseAddress(cvpixbuffer) 0x1dcb519c4 <+80>: mov x0, x19 0x1dcb519c8 <+84>: bl 0x1dda62c90 CVPixelBufferGetBaseAddress 0x1dcb519cc <+88>: mov x21, x0 x21 = pix = address 0x1dcb519d0 <+92>: mov x0, x19 0x1dcb519d4 <+96>: bl 0x1dda62cc0 CVPixelBufferGetPixelFormatType(cvpixbuffer) 0x1dcb519d8 <+100>: mov x1, x0 x1 = pixformatdesc 0x1dcb519dc <+104>: mov x0, #0x0 0x1dcb519e0 <+108>: bl 0x1dda62d20 CVPixelFormatDescriptionCreateWithPixelFormatType(NULL: allocator, pixformatdesc) 0x1dcb519e4 <+112>: bl 0x1dda632f0 autorelease 0x1dcb519e8 <+116>: mov x20, x0 x0 : __NSFrozenDictionaryM = format desc 0x1dcb519ec <+120>: cbz x0, 0x1dcb51b20 ; <+428> 0x1dcb519f0 <+124>: cbz x21, 0x1dcb51b5c; <+488>校验是否为空 baseAddress pixformatdesc is nil 0x1dcb519f4 <+128>: fcvtmu x22, d9; 类型转换 x22 = d9 = d3 height 0x1dcb519f8 <+132>: fcvtmu x23, d8 x23 = width 0x1dcb519fc <+136>: adrp x8, 64224 0x1dcb51a00 <+140>: ldr x8, [x8, #0x958] 0x1dcb51a04 <+144>: ldr x2, [x8] 0x1dcb51a08 <+148>: mov x0, x20 0x1dcb51a0c <+152>: bl 0x1dcc41360 pifornmar[BitsPerBlock] is 32 ; objc_msgSend$objectForKeyedSubscript: 0x1dcb51a10 <+156>: bl 0x1dda632f0 
autorelease 0x1dcb51a14 <+160>: mov x24, x0; 0x1dcb51a18 <+164>: bl 0x1dcc3ee00 [x0 integervalue]; objc_msgSend$integerValue 0x1dcb51a1c <+168>: mov x25, x0 ; x25 = bitsperBlock = 32 0x1dcb51a20 <+172>: bl 0x1dda63450 x0 release 0x1dcb51a24 <+176>: adrp x2, 102753 0x1dcb51a28 <+180>: add x2, x2, #0x9e0; @"BitsPerComponent" 0x1dcb51a2c <+184>: mov x0, x20 0x1dcb51a30 <+188>: bl 0x1dcc41360 pifornmar[BitsPerComponent]; objc_msgSend$objectForKeyedSubscript: 0x1dcb51a34 <+192>: bl 0x1dda632f0 autorelease 0x1dcb51a38 <+196>: mov x24, x0 0x1dcb51a3c <+200>: bl 0x1dcc3ee00; objc_msgSend$integerValue 0x1dcb51a40 <+204>: mov x26, x0 ; x26 = 8 通道 0x1dcb51a44 <+208>: bl 0x1dda63450 x0 release 0x1dcb51a48 <+212>: fmov d0, d11 0x1dcb51a4c <+216>: fmov d1, d10 0x1dcb51a50 <+220>: fmov d2, d9 0x1dcb51a54 <+224>: fmov d3, d8 0x1dcb51a58 <+228>: bl 0x1dda62af0 CGRectGetMinX 0x1dcb51a5c <+232>: fcvtmu x24, d0 x24 = minX 0x1dcb51a60 <+236>: fmov d0, d11 0x1dcb51a64 <+240>: fmov d1, d10 0x1dcb51a68 <+244>: fmov d2, d9 0x1dcb51a6c <+248>: fmov d3, d8 0x1dcb51a70 <+252>: bl 0x1dda62b00 CGRectGetMinY 0x1dcb51a74 <+256>: fcvtmu x27, d0 ; x27 = minY 0x1dcb51a78 <+260>: lsr x8, x25, #3 ; 右移三位 x8 = 4了 0x1dcb51a7c <+264>: madd x21, x8, x24, x21 (4 * x24) + baseAddress 0x1dcb51a80 <+268>: mov x0, x19 0x1dcb51a84 <+272>: bl 0x1dda62ca0 CVPixelBufferGetBytesPerRow(pifbuffer, ) 0x1dcb51a88 <+276>: madd x21, x0, x27, x21 (byterPerfow * minY ) + baseAddress 0x1dcb51a8c <+280>: adrp x8, 64219 0x1dcb51a90 <+284>: ldr x8, [x8, #0xb60] 0x1dcb51a94 <+288>: ldr x0, [x8] 0x1dcb51a98 <+292>: bl 0x1dda627e0 CGColorSpaceCreateWithName(kCGColorSpaceSRGB) 0x1dcb51a9c <+296>: mov x24, x0 x24 = <CGColorSpace 0x28121fe40> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) 0x1dcb51aa0 <+300>: mov x0, x19; 0x1dcb51aa4 <+304>: bl 0x1dda62ca0 CVPixelBufferGetBytesPerRow(pixbuffer) 0x1dcb51aa8 <+308>: mov x4, x0 x4 = btresPerrow 0x1dcb51aac <+312>: mov x0, x21 0x1dcb51ab0 <+316>: mov x1, x22 0x1dcb51ab4 
<+320>: mov x2, x23 0x1dcb51ab8 <+324>: mov x3, x26 0x1dcb51abc <+328>: mov x5, x24 0x1dcb51ac0 <+332>: mov w6, #0x2002 0x1dcb51ac4 <+336>: bl 0x1dda62700 CGBitmapContextCreate(baseAddress, width, height ,BitsPerComponent(8), btresPerrow, minX, 0x2002) 0x1dcb51ac8 <+340>: mov x21, x0;; x0 = bitMap 0x1dcb51acc <+344>: bl 0x1dda62710; _CGBitmapContextCreateImage // 0x1dcb51ad0 <+348>: mov x22, x0 0x1dcb51ad4 <+352>: mov x0, x21 0x1dcb51ad8 <+356>: bl 0x1dda62850 0x1dcb51adc <+360>: mov x0, x24 0x1dcb51ae0 <+364>: bl 0x1dda62810 0x1dcb51ae4 <+368>: mov x0, x19 0x1dcb51ae8 <+372>: mov w1, #0x1 0x1dcb51aec <+376>: bl 0x1dda62d10 0x1dcb51af0 <+380>: bl 0x1dda63410 0x1dcb51af4 <+384>: mov x0, x22 0x1dcb51af8 <+388>: ldp x29, x30, [sp, #0x80] 0x1dcb51afc <+392>: ldp x20, x19, [sp, #0x70] 0x1dcb51b00 <+396>: ldp x22, x21, [sp, #0x60] 0x1dcb51b04 <+400>: ldp x24, x23, [sp, #0x50] 0x1dcb51b08 <+404>: ldp x26, x25, [sp, #0x40] 0x1dcb51b0c <+408>: ldp x28, x27, [sp, #0x30] 0x1dcb51b10 <+412>: ldp d9, d8, [sp, #0x20] 0x1dcb51b14 <+416>: ldp d11, d10, [sp, #0x10] 0x1dcb51b18 <+420>: add sp, sp, #0x90 0x1dcb51b1c <+424>: retab 0x1dcb51b20 <+428>: adrp x8, 76725 0x1dcb51b24 <+432>: ldr x0, [x8, #0xae8] 0x1dcb51b28 <+436>: adrp x8, 176 0x1dcb51b2c <+440>: add x8, x8, #0xc38 ; "pixelFormatDict" 0x1dcb51b30 <+444>: str x8, [sp] 0x1dcb51b34 <+448>: adrp x2, 176 0x1dcb51b38 <+452>: add x2, x2, #0xbb4 ; "((pixelFormatDict) != nil)" 0x1dcb51b3c <+456>: adrp x3, 176 0x1dcb51b40 <+460>: add x3, x3, #0xbcf ; "-[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]" 0x1dcb51b44 <+464>: adrp x6, 102753 0x1dcb51b48 <+468>: add x6, x6, #0x9c0 ; @"Expected non-nil value for '%s'" 0x1dcb51b4c <+472>: mov w4, #0x0 0x1dcb51b50 <+476>: mov w5, #0x0 0x1dcb51b54 <+480>: bl 0x1dcc3d080 ; objc_msgSend$handleFailedAssertWithCondition:functionName:simulateCrash:showAlert:format: 0x1dcb51b58 <+484>: cbnz x21, 0x1dcb519f4 ; <+128> 0x1dcb51b5c <+488>: adrp x8, 76725 0x1dcb51b60 <+492>: ldr 
x0, [x8, #0xae8] 0x1dcb51b64 <+496>: adrp x8, 176 0x1dcb51b68 <+500>: add x8, x8, #0xc65 ; "bufferBaseAddress" 0x1dcb51b6c <+504>: str x8, [sp] 0x1dcb51b70 <+508>: adrp x2, 176 0x1dcb51b74 <+512>: add x2, x2, #0xc48 ; "((bufferBaseAddress) != nil)" 0x1dcb51b78 <+516>: adrp x3, 176 0x1dcb51b7c <+520>: add x3, x3, #0xbcf ; "-[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]" 0x1dcb51b80 <+524>: adrp x6, 102753 0x1dcb51b84 <+528>: add x6, x6, #0x9c0 ; @"Expected non-nil value for '%s'" 0x1dcb51b88 <+532>: mov w4, #0x0 0x1dcb51b8c <+536>: mov w5, #0x0 0x1dcb51b90 <+540>: bl 0x1dcc3d080 ; objc_msgSend$handleFailedAssertWithCondition:functionName:simulateCrash:showAlert:format: 0x1dcb51b94 <+544>: b 0x1dcb519f4 ; <+128> 0x1dcb51b98 <+548>: ret
The pseudocode is as follows:
CVPixelBufferLockBaseAddress(pixelBuffer, 1)
baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
formatType = CVPixelBufferGetPixelFormatType(pixelBuffer)
formatDesc: Dictionary = CVPixelFormatDescriptionCreateWithPixelFormatType(NULL, formatType)
if baseAddress == nil || formatDesc == nil { go to other logic }
bitsPerBlock = [formatDesc[BitsPerBlock] integerValue]
bitsPerComponent = [formatDesc[BitsPerComponent] integerValue]
minX = CGRectGetMinX(x, y, w, h)
minY = CGRectGetMinY(x, y, w, h)
baseAddress = (bitsPerBlock / 8) * minX + baseAddress // 4 * minX
bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
baseAddress = bytesPerRow * minY + baseAddress
colorSpace = CGColorSpaceCreateWithName(kCGColorSpaceSRGB)
bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
bitMap = CGBitmapContextCreate(baseAddress, width, height, bitsPerComponent (8), bytesPerRow, colorSpace, 0x2002)
CGBitmapContextCreateImage(bitMap)
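The pointer arithmetic in this pseudocode can be checked with a few lines of C. The sketch below computes where the cropped region starts and ends inside the pixel buffer, using the values seen in the debugging session later in this article (height 1825, bytesPerRow 4736, crop rect roughly 69, 349, 1045, 741 after rounding down). The type and function names are hypothetical, and whether a consumer reads a full stride for the last row is an assumption rather than something established here; the check only illustrates the kind of out-of-bounds read that is one of the possibilities.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t height;        /* rows in the pixel buffer */
    size_t bytes_per_row; /* stride, including any row padding */
} buffer_geometry;

typedef struct {
    size_t min_x, min_y, width, height; /* already rounded down, as fcvtmu does */
} crop_rect;

enum { BYTES_PER_PIXEL = 4 }; /* BitsPerBlock (32) >> 3 */

static size_t buffer_size(const buffer_geometry *b)
{
    return b->height * b->bytes_per_row;
}

/* Byte offset of the first cropped pixel: 4 * minX + bytesPerRow * minY. */
static size_t crop_start(const buffer_geometry *b, const crop_rect *c)
{
    return c->min_x * BYTES_PER_PIXEL + c->min_y * b->bytes_per_row;
}

/* End offset if only the visible pixels of the last row are read. */
static size_t crop_end_tight(const buffer_geometry *b, const crop_rect *c)
{
    assert(c->height > 0);
    return crop_start(b, c) + (c->height - 1) * b->bytes_per_row
         + c->width * BYTES_PER_PIXEL;
}

/* End offset if every row, including the last, is read at full stride.
 * ASSUMPTION: this models a consumer that copies bytesPerRow * height. */
static size_t crop_end_full_stride(const buffer_geometry *b, const crop_rect *c)
{
    return crop_start(b, c) + c->height * b->bytes_per_row;
}
```

For the session values the region fits under both models. For a hypothetical crop rect that touches the bottom edge of the buffer with minX > 0, the tight model still fits, while the full-stride model ends 4 * minX bytes past the end of the buffer.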
VisionKitCore`-[VKCRemoveBackgroundResult createCGImage]:
VisionKitCore`-[VKCRemoveBackgroundResult createCGImage]: 0x1dcb517e4 <+0>: pacibsp 0x1dcb517e8 <+4>: sub sp, sp, #0x60 0x1dcb517ec <+8>: stp d11, d10, [sp, #0x10] 0x1dcb517f0 <+12>: stp d9, d8, [sp, #0x20] 0x1dcb517f4 <+16>: stp x22, x21, [sp, #0x30] 0x1dcb517f8 <+20>: stp x20, x19, [sp, #0x40] 0x1dcb517fc <+24>: stp x29, x30, [sp, #0x50] 0x1dcb51800 <+28>: add x29, sp, #0x50 0x1dcb51804 <+32>: mov x20, x0 0x1dcb51808 <+36>: bl 0x1dcc41b00 ; objc_msgSend$pixelBuffer 0x1dcb5180c <+40>: mov x19, x0 ; <CVPixelBuffer 0x283708dc0 width=1179 height=1825 bytesPerRow=4736 pixelFormat=BGRA iosurface=0x280214df0 poolName=CoreVideo attributes={ 0x1dcb51810 <+44>: mov x0, x20; <VKCRemoveBackgroundResult: 0x282b3cd90> 0x1dcb51814 <+48>: bl 0x1dcc3ac00 ; objc_msgSend$cropRect // 获取截图区域 d0 = 69.08203125 d1 = 349.31640625 d2 = 1045.44140625 d3 = 741.40625 0x1dcb51818 <+52>: fmov d11, d0 0x1dcb5181c <+56>: fmov d10, d1 0x1dcb51820 <+60>: fmov d9, d2 0x1dcb51824 <+64>: fmov d8, d3 0x1dcb51828 <+68>: cbz x19, 0x1dcb51888; <+164> 判断pixbuffer是否为nil 0x1dcb5182c <+72>: fmov d0, d11 0x1dcb51830 <+76>: fmov d1, d10 0x1dcb51834 <+80>: fmov d2, d9 0x1dcb51838 <+84>: fmov d3, d8 0x1dcb5183c <+88>: bl 0x1dcb4cecc ; VKMRectHasArea 0x1dcb51840 <+92>: cbz w0, 0x1dcb51888 ; <+164> 是否在区域内 0x1dcb51844 <+96>: mov w22, #0x5241 0x1dcb51848 <+100>: movk w22, #0x4247, lsl #16 0x1dcb5184c <+104>: mov x0, x19 0x1dcb51850 <+108>: bl 0x1dda62d00; CVPixelBufferRetain , CVPixelBufferUnlockBaseAddress 0x1dcb51854 <+112>: mov x0, x19 0x1dcb51858 <+116>: bl 0x1dda62cc0; CVPixelBufferGetPixelFormatType 0x1dcb5185c <+120>: cmp w0, w22; 不同就返回 0x1dcb51860 <+124>: b.ne 0x1dcb518e4 CVPixelBufferGetWidth ; <+256> 0x1dcb51864 <+128>: mov x0, x20 0x1dcb51868 <+132>: mov x2, x19 0x1dcb5186c <+136>: fmov d0, d11 0x1dcb51870 <+140>: fmov d1, d10 0x1dcb51874 <+144>: fmov d2, d9 0x1dcb51878 <+148>: fmov d3, d8 x0: self, x1: selector x2 cvpixbuffer x d0 d0 = 69.08203125 d1 = 349.31640625 d2 = 1045.44140625 d3 = 741.40625 
0x1dcb5187c <+152>: bl 0x1dcb51974 ; -[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]-> 0x1dcb51880 <+156>: mov x20, x0 0x1dcb51884 <+160>: b 0x1dcb5194c ; <+360> 0x1dcb51888 <+164>: adrp x8, 76725 0x1dcb5188c <+168>: ldr x20, [x8, #0xae8] 0x1dcb51890 <+172>: fmov d0, d11 0x1dcb51894 <+176>: fmov d1, d10 0x1dcb51898 <+180>: fmov d2, d9 0x1dcb5189c <+184>: fmov d3, d8 0x1dcb518a0 <+188>: bl 0x1dcbd1834 ; VKMUIStringForRect 0x1dcb518a4 <+192>: bl 0x1dda632f0 0x1dcb518a8 <+196>: mov x21, x0 0x1dcb518ac <+200>: stp x19, x0, [sp] 0x1dcb518b0 <+204>: adrp x2, 176 0x1dcb518b4 <+208>: add x2, x2, #0xaf3 ; "__objc_no" 0x1dcb518b8 <+212>: adrp x3, 176 0x1dcb518bc <+216>: add x3, x3, #0xafd ; "-[VKCRemoveBackgroundResult createCGImage]" 0x1dcb518c0 <+220>: adrp x6, 102753 0x1dcb518c4 <+224>: add x6, x6, #0x9a0 ; @"CreateCGImage is buffer incorrect, buffer: %@, cropRect:%@" 0x1dcb518c8 <+228>: mov x0, x20 0x1dcb518cc <+232>: mov w4, #0x0 0x1dcb518d0 <+236>: mov w5, #0x0 0x1dcb518d4 <+240>: bl 0x1dcc3d080 ; objc_msgSend$handleFailedAssertWithCondition:functionName:simulateCrash:showAlert:format: 0x1dcb518d8 <+244>: bl 0x1dda63420 0x1dcb518dc <+248>: mov x20, #0x0 0x1dcb518e0 <+252>: b 0x1dcb51954 ; <+368> 0x1dcb518e4 <+256>: mov x21, x0 0x1dcb518e8 <+260>: adrp x8, 76725 0x1dcb518ec <+264>: ldr x20, [x8, #0xae8] 0x1dcb518f0 <+268>: mov w0, #0x5241 0x1dcb518f4 <+272>: movk w0, #0x4247, lsl #16 0x1dcb518f8 <+276>: bl 0x1dcbd1b80 ; VKMUIStringForCVPixelBufferType 0x1dcb518fc <+280>: bl 0x1dda632f0 0x1dcb51900 <+284>: mov x22, x0 0x1dcb51904 <+288>: mov x0, x21 0x1dcb51908 <+292>: bl 0x1dcbd1b80 ; VKMUIStringForCVPixelBufferType 0x1dcb5190c <+296>: bl 0x1dda632f0 0x1dcb51910 <+300>: mov x21, x0 0x1dcb51914 <+304>: stp x22, x0, [sp] 0x1dcb51918 <+308>: adrp x2, 176 0x1dcb5191c <+312>: add x2, x2, #0xaf3 ; "__objc_no" 0x1dcb51920 <+316>: adrp x3, 176 0x1dcb51924 <+320>: add x3, x3, #0xafd ; "-[VKCRemoveBackgroundResult createCGImage]" 0x1dcb51928 <+324>: adrp 
x6, 102753 0x1dcb5192c <+328>: add x6, x6, #0x980 ; @"Pixel format for createCGImage is incorrect, expected: %@, received: %@. Bailing" 0x1dcb51930 <+332>: mov x0, x20 0x1dcb51934 <+336>: mov w4, #0x0 0x1dcb51938 <+340>: mov w5, #0x0 0x1dcb5193c <+344>: bl 0x1dcc3d080 ; objc_msgSend$handleFailedAssertWithCondition:functionName:simulateCrash:showAlert:format: 0x1dcb51940 <+348>: bl 0x1dda63420 0x1dcb51944 <+352>: bl 0x1dda63430 0x1dcb51948 <+356>: mov x20, #0x0 0x1dcb5194c <+360>: mov x0, x19 0x1dcb51950 <+364>: bl 0x1dda62cf0 0x1dcb51954 <+368>: mov x0, x20 0x1dcb51958 <+372>: ldp x29, x30, [sp, #0x50] 0x1dcb5195c <+376>: ldp x20, x19, [sp, #0x40] 0x1dcb51960 <+380>: ldp x22, x21, [sp, #0x30] 0x1dcb51964 <+384>: ldp d9, d8, [sp, #0x20] 0x1dcb51968 <+388>: ldp d11, d10, [sp, #0x10] 0x1dcb5196c <+392>: add sp, sp, #0x60 0x1dcb51970 <+396>: retab
Pseudocode logic:
pixelBuffer = [VKCRemoveBackgroundResult pixelBuffer]
cropRect = [VKCRemoveBackgroundResult cropRect]
if pixelBuffer == nil || !VKMRectHasArea(cropRect) { go to other logic }
CVPixelBufferRetain(pixelBuffer)
if CVPixelBufferGetPixelFormatType(pixelBuffer) != 0x42475241 { log the mismatch and bail }
[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:pixelBuffer cropRect:cropRect]
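The constant compared against the pixel format type is built by the pair mov w22, #0x5241 and movk w22, #0x4247, lsl #16 in the disassembly. A short C sketch shows how the two halves combine and how the result reads as a four-character code; the helper names mov_movk and fourcc_to_string are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* `mov w, #low` sets the low 16 bits; `movk w, #high, lsl #16` then
 * replaces bits 16..31 while keeping the low half. */
static uint32_t mov_movk(uint16_t low, uint16_t high)
{
    return (uint32_t)low | ((uint32_t)high << 16);
}

/* Decodes a four-character code, most significant byte first. */
static void fourcc_to_string(uint32_t code, char out[5])
{
    out[0] = (char)((code >> 24) & 0xff);
    out[1] = (char)((code >> 16) & 0xff);
    out[2] = (char)((code >> 8) & 0xff);
    out[3] = (char)(code & 0xff);
    out[4] = '\0';
}
```

mov_movk(0x5241, 0x4247) gives 0x42475241, which decodes to the characters B, G, R, A. That is the value of kCVPixelFormatType_32BGRA, consistent with pixelFormat=BGRA in the CVPixelBuffer description and with the method name _createCGImageFromBGRAPixelBuffer:cropRect:.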
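From the logic above, the long-press handling inside WKWebView works like this:
WKWebView, across processes, crops an image out of a bitmap and passes it to VisionKitCore. VisionKit then takes the buffer for that region directly and creates an image from it for further processing. Why exactly it crashes is hard to determine at this point, because the bitmap object was created much earlier and only fails here when it is consumed. It may have been released early, it may be a dangling pointer, or the access may be out of bounds. So I tried to find clues elsewhere.
Comparing OS versions
Since production data shows that the crash no longer occurs on iOS 16.2 and later, this may really be a system bug that was quietly fixed. So I found several phones on newer versions to experiment with.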

iOS 16.2 

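After long-pressing the web view, the x1 passed to __vk_cgImageRemoveBackgroundWithDownsizing_block_invoke is nil. I also set symbolic breakpoints on every VKCRemoveBackgroundResult symbol and found that none of them is hit when long-pressing the web view. The behavior is completely different from that of the iOS 16.1.1 device.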

iOS 17 

到了iOS 17 后又不一样了,VisionKitCore-[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]:改成了直接调用 visionkit 里面的 vk_cgImageFromPixelBuffer 创建。
VisionKitCore`-[VKCRemoveBackgroundResult _createCGImageFromBGRAPixelBuffer:cropRect:]:
-> 0x204396388 <+0>: cbz x0, 0x2043963f8 ; <+112>
0x20439638c <+4>: pacibsp
0x204396390 <+8>: stp d11, d10, [sp, #-0x40]!
0x204396394 <+12>: stp d9, d8, [sp, #0x10]
0x204396398 <+16>: stp x20, x19, [sp, #0x20]
0x20439639c <+20>: stp x29, x30, [sp, #0x30]
0x2043963a0 <+24>: add x29, sp, #0x30
0x2043963a4 <+28>: fmov d8, d3
0x2043963a8 <+32>: fmov d9, d2
0x2043963ac <+36>: fmov d10, d1
0x2043963b0 <+40>: fmov d11, d0
0x2043963b4 <+44>: mov x0, x1
0x2043963b8 <+48>: bl 0x20444d4e8 ; vk_cgImageFromPixelBuffer
0x2043963bc <+52>: mov x19, x0
0x2043963c0 <+56>: fmov d0, d11
0x2043963c4 <+60>: fmov d1, d10
0x2043963c8 <+64>: fmov d2, d9
0x2043963cc <+68>: fmov d3, d8
0x2043963d0 <+72>: bl 0x206acc070
0x2043963d4 <+76>: mov x20, x0
0x2043963d8 <+80>: mov x0, x19
0x2043963dc <+84>: bl 0x206acc110
0x2043963e0 <+88>: mov x0, x20
0x2043963e4 <+92>: ldp x29, x30, [sp, #0x30]
0x2043963e8 <+96>: ldp x20, x19, [sp, #0x20]
0x2043963ec <+100>: ldp d9, d8, [sp, #0x10]
0x2043963f0 <+104>: ldp d11, d10, [sp], #0x40
0x2043963f4 <+108>: retab
0x2043963f8 <+112>: ret

iOS 16.1.1 

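On this version, x1 always has a value when the block is invoked, so the call path is taken.
What is this image actually used for?
It looks like a low-resolution thumbnail is drawn; its purpose is not obvious.
Continuing:
It appears to call back into WebKit. WebKit is open source, so let's keep reading.
Find the WebKit version that matches the device:
The code is in ImageAnalysisUtilities.mm (https://github.com/WebKit/WebKit/blob/releases/Apple/Safari-16.1-iOS-16.1.1/Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.mm)
It looks like image recognition, but that is not certain yet, so I searched for its callers. GitHub now supports symbol search directly.
This essentially confirms that it performs object recognition on images, with an extra guard: if there is no image, it returns.
WebContextMenuProxyMac.mm (https://github.com/WebKit/WebKit/blob/main/Source/WebKit/UIProcess/mac/WebContextMenuProxyMac.mm#L334)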
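Solution
From the findings above, some preliminary conclusions: this is a feature new in iOS 16, namely image recognition. On iOS 16 the system Photos app also supports long-press subject lifting, and the system added the same capability to every image in WKWebView.
  1. Every version in iOS 16.0..<16.2 carries this latent bug. It is not caused by app developers.
  2. _platform_memmove is a very low-level, heavily used API, so the fault cannot realistically be there.
  3. Most likely the way WKWebView uses the API is at fault, or VisionKit's subject-lifting capability has a bug. Because several asynchronous hops plus XPC dispatch are involved, it is hard to say which.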

Solution 1

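It occurred to me that since this is default behavior, removing the behavior might be enough. The call stack earlier also shows that when -[VKCRemoveBackgroundResult createCGImage] creates the image for recognition, the system has a nil check and does not crash on a nil result, so I can simply make it return nothing.
So I wrote a demo that hooks this behavior, using a photo of the kitten at home.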
#import <WebKit/WebKit.h>
#import <objc/runtime.h>

- (void)viewDidLoad {
    [super viewDidLoad];
    // `config` was not defined in the original snippet; a default configuration is assumed.
    WKWebViewConfiguration *config = [[WKWebViewConfiguration alloc] init];
    WKWebView *webview = [[WKWebView alloc] initWithFrame:self.view.bounds configuration:config];
    [self.view addSubview:webview];
    [webview loadRequest:[NSURLRequest requestWithURL:[NSURL URLWithString:@"https://www.valiantcat.cn/dsn.html"]]];

    UIButton *button = [UIButton buttonWithType:UIButtonTypeCustom];
    button.frame = CGRectMake(0, 0, 300, 200);
    [button setTitle:@"Tap to hook" forState:UIControlStateNormal];
    [button setTitleColor:UIColor.redColor forState:UIControlStateNormal];
    [self.view addSubview:button];
    button.center = self.view.center;
    [button addTarget:self action:@selector(hook) forControlEvents:UIControlEventTouchUpInside];
}

- (void)hook {
    Class cls = objc_getClass("VKCRemoveBackgroundResult");
    SEL selector = sel_registerName("createCGImage");
    Method m = class_getInstanceMethod(cls, selector);
    if (cls == Nil || m == NULL) {
        return; // private class or method not present on this OS version
    }
    const char *type = method_getTypeEncoding(m);
    // A block passed to imp_implementationWithBlock receives self but not _cmd.
    IMP newImp = imp_implementationWithBlock(^CGImageRef(id _self) {
        return NULL; // the caller nil-checks the result, so subject lifting is skipped
    });
    IMP oldImp = class_replaceMethod(cls, selector, newImp, type);
    NSLog(@"%p", oldImp);
}
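After the hook is installed, long-pressing an image no longer offers subject lifting.
Based on this I judged the approach feasible. I checked with the detail-page and container teams: they do no extra handling of WKWebView's default behavior, so the hook should have little impact on Mobile Taobao's business. I prepared to ship it.
Just before shipping, however, I noticed that Taobao's Scan and Pailitao features also use VisionKit. Replacing a VisionKit method therefore looked risky, and I was stuck again.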

Findings from the diff

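It then occurred to me that since the code is open source and the crash exists only on versions in iOS 16.0..<16.2, I could look at how the system quietly fixed the bug. Sure enough, there were clues: in several places that copy images, the change touches how the image length/size is handled. (When I forced different arguments into this function at a symbolic breakpoint, I could not reproduce the same crash.) Even so, the diff makes it more likely that the bug comes from WKWebView rather than from VisionKit.
Diff link: https://github.com/WebKit/WebKit/compare/releases/Apple/Safari-16.1-iOS-16.1.2...releases/Apple/Safari-16.2-iOS-16.2?diff=split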

Solution 2

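I continued investigating from the WKWebView side, going through the stack triggered by a long press for useful information.
Reading the code shows that this is new in iOS 16, and the source also shows how the gesture is added.
It turns out that before iOS 16, WKWebView had only one such gesture: a long press brought up the save-image menu.
From iOS 16 on, WKWebView adds two gestures that compete for the user's long press.
  • Verifying the timeout path
Add a symbolic breakpoint on -[WKContentView imageAnalysisGestureDidBegin:] with the command thread return to cut the logic short. As expected, the timeout path is hit.
The code shows that the menu built on timeout has no copySubject item.
  • Non-timeout path
WKContentViewInteraction.mm (https://github.com/WebKit/WebKit/blob/releases/Apple/Safari-16.1-iOS-16.1.1/Source/WebKit/UIProcess/ios/WKContentViewInteraction.mm)
After subject recognition succeeds, the menu includes a CopySubject item.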

So the new approach is to hook the image-recognition step of WKWebView's long-press gesture.
static void hook2(void) {
    Class cls = objc_getClass("WKContentView");
    SEL selector = sel_registerName("imageAnalysisGestureDidBegin:");
    Method m = class_getInstanceMethod(cls, selector);
    if (cls == Nil || m == NULL) {
        return; // check before m is used below
    }
    const char *type = method_getTypeEncoding(m);
    IMP newImp = imp_implementationWithBlock(^void(id self, UILongPressGestureRecognizer *ges) {
        // do nothing
    });
    class_replaceMethod(cls, selector, newImp, type);
}

void hookStart(void) {
    if (@available(iOS 16.0, *)) {
        if (@available(iOS 16.2, *)) {
            return; // fixed by the system from 16.2 on
        } else {
            hook2(); // only iOS 16.0..<16.2
        }
    }
}
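The version gate in hookStart is a half-open range check, iOS 16.0 inclusive to iOS 16.2 exclusive. A minimal sketch of the same decision as a plain C predicate follows, which can be handy for unit-testing the gate separately from the runtime hook. The function name is hypothetical, and it assumes the major and minor version numbers have already been obtained from the system.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: true only for iOS 16.0..<16.2, the range in
 * which the article observed the crash. Patch versions such as 16.1.1
 * fall under minor == 1 and are therefore included. */
bool needs_image_analysis_hook(int major, int minor) {
    assert(major >= 0 && minor >= 0);
    return major == 16 && minor < 2;
}
```

Keeping the upper bound exclusive means devices on iOS 16.2 and later keep the system's subject-lifting and OCR features, since the hook is never installed there.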

Observations in production

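Hooking the long-press gesture disables WKWebView's built-in subject lifting and text OCR, so we were concerned about negative user feedback. We therefore implemented the hook in Mobile Taobao's safety-cushion SDK and rolled the fix out gradually. During the staged rollout in 10.28.11, crashes dropped from 500+ to 67 (the hook takes effect on cold start, so there is a lag), which confirmed that the fix works, and no user complaints came in. After full rollout, crashes in Mobile Taobao builds carrying the hook dropped to essentially zero, and the bug was fully resolved. This removes 1200+ crashes per day, affecting 1000+ devices.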
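Summary
Stability work is a long-term effort. Thanks to earlier work by colleagues, most user-facing crashes had already been resolved, so operating-system bugs gradually surfaced and climbed the rankings. At first I had little confidence that a system bug could be solved, but during the investigation I applied what I had learned, peeled the problem back layer by layer, and gradually located it. I am no longer intimidated by system crashes. Thanks also to my colleagues for sharing their experience and guidance while I was tracking down the bug.
If you have questions about the investigation or spot mistakes in it, discussion and corrections are welcome.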
References
1. iOS app crashed on iOS 16 https://developer.apple.com/forums/thread/718305
2. The-ABI-for-ARM-64-bit-Architecture https://developer.arm.com/documentation/den0024/a/The-ABI-for-ARM-64-bit-Architecture/Register-use-in-the-AArch64-Procedure-Call-Standard/Parameters-in-general-purpose-registers
3. WebKit https://github.com/WebKit/WebKit