From: "Andrew Fish"
Subject: How to debug a BaseTools build Python hang?
Message-id: <17E30409-0F4C-49AA-B094-960F824FDE04@apple.com>
Date: Fri, 25 Jun 2021 13:05:03 -0700
To: edk2-devel-groups-io

I’m hitting an issue with deadlocks on our fork of the edk2 build system. It only happens on our builders (30+ cores), and I can’t reproduce it locally. We realized our build config had been pinning the number of threads to a fixed value, and when we removed that setting we started seeing this issue.

The build is deadlocking during the `…` so that implies during the autogen phase.

I hooked up a signal handler so I could catch `kill != 9` and dump the Python threads.

I see these threads:
MainThread: seems stuck at the self.AutoGenMgr.join() in StartAutoGen().
Thread-1: looks like a logging thread.
Thread-2: looks like the thread printing …
2 x QueueFeederThread: I’m not sure if these are workers for Python or kicked off by the build?

I’m looking for some advice on how to debug the deadlock. I’ve tried sampling via kill -2 etc. and my exception handler but I don’t seem to be able to see the work happening on Python threads….

Thread: QueueFeederThread(123145563672576)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:774 function __bootstrap
    self.__bootstrap_inner()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:801 function __bootstrap_inner
    self.run()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:754 function run
    self.__target(*self.__args, **self.__kwargs)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py:252 function _feed
    nwait()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:340 function wait
    waiter.acquire()

Thread: Thread-2(123145555259392)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:774 function __bootstrap
    self.__bootstrap_inner()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:801 function __bootstrap_inner
    self.run()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:754 function run
    self.__target(*self.__args, **self.__kwargs)
./BaseTools/Source/Python/Common/Misc.py:907 function _ProgressThreadEntry
    time.sleep(self._CheckInterval)

Thread: MainThread(4614635008)
./BaseTools/Source/Python/build/build.py:2754 function
    r = Main()
./BaseTools/Source/Python/build/build.py:2643 function Main
    MyBuild.Launch()
./BaseTools/Source/Python/build/build.py:2438 function Launch
    self._MultiThreadBuildPlatform()
./BaseTools/Source/Python/build/build.py:2246 function _MultiThreadBuildPlatform
    Wa, self.BuildModules = self.PerformAutoGen(BuildTarget,ToolChain)
./BaseTools/Source/Python/build/build.py:2185 function PerformAutoGen
    autogen_rt, errorcode = self.StartAutoGen(mqueue, Pa.DataPipe, self.SkipAutoGen, PcdMaList, cqueue)
./BaseTools/Source/Python/build/build.py:889 function StartAutoGen
    self.AutoGenMgr.join()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:940 function join
    self.__block.wait()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:340 function wait
    waiter.acquire()
./BaseTools/Source/Python/build/build.py:2737 function AppleDumpAllThreadStacks
    for filename, lineno, name, line in traceback.extract_stack(stack):

Thread: Thread-1(123145546846208)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:774 function __bootstrap
    self.__bootstrap_inner()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:801 function __bootstrap_inner
    self.run()
./BaseTools/Source/Python/AutoGen/AutoGenWorker.py:85 function run
    log_message = self.log_q.get()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py:117 function get
    res = self._recv()

Thread: QueueFeederThread(123145551052800)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:774 function __bootstrap
    self.__bootstrap_inner()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:801 function __bootstrap_inner
    self.run()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:754 function run
    self.__target(*self.__args, **self.__kwargs)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/queues.py:252 function _feed
    nwait()
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py:340 function wait
    waiter.acquire()

Thanks,

Andrew Fish
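
For reference, a minimal sketch of the kind of signal-driven thread dump described above. It is not the code from the fork: the function names `format_all_thread_stacks`, `dump_all_thread_stacks`, and `install_dump_handler`, and the choice of SIGUSR1, are assumptions made for illustration. Only the `traceback.extract_stack` loop is taken from the dump itself. The sketch relies on `sys._current_frames()` and should run on both Python 2.7 and Python 3 on POSIX systems.

```python
import signal
import sys
import threading
import traceback


def format_all_thread_stacks():
    """Return a text dump of the current stack of every Python thread."""
    # Map thread idents to names so each stack can be labelled.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for thread_id, stack in sys._current_frames().items():
        lines.append("Thread: %s(%d)" % (names.get(thread_id, "unknown"), thread_id))
        for filename, lineno, name, line in traceback.extract_stack(stack):
            lines.append("%s:%d function %s" % (filename, lineno, name))
            if line:
                lines.append("    %s" % line)
        lines.append("")
    return "\n".join(lines)


def dump_all_thread_stacks(signum, frame):
    """Signal handler (hypothetical name): write every thread's stack to stderr."""
    sys.stderr.write(format_all_thread_stacks() + "\n")
    sys.stderr.flush()


def install_dump_handler(signum=signal.SIGUSR1):
    """Register the dump handler; must be called from the main thread."""
    # SIGUSR1 is an arbitrary choice here. Signal 9 (SIGKILL) cannot be caught.
    signal.signal(signum, dump_all_thread_stacks)
```

With `install_dump_handler()` called early in the build, `kill -USR1 <pid>` would print a dump in roughly the same shape as the one above. One caveat: `sys._current_frames()` only sees threads in the process that receives the signal, so any work running in separate worker processes would not show up in that dump.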