Message Queue Dumping

MPI Side Document
• MPIR and MQD were designed around 1995
• MPIR: Process discovery
• Formally described in the MPIR document in 2010
• MQD details were omitted
• MQD: Message Queue Dumping
• Many MPI implementations support it and almost all debuggers use it
• Not yet documented
The MQD document
• V1.0: Documenting existing interface and practice
• Vnext:
• Merge with the MPIR document to be one side document
• Introduce handle introspection interface
• Other improvements
Message Queue Dumping Interface
• Tools and debuggers use MQD to extract information about messages
• In practice, only debuggers do so, because symbols are needed
• Typically there are three conceptual queues:
• Unexpected Receive
• Posted Receive
• Send
• The MQD allows the following
• Querying if MQD is supported
• If yes, allows for iterating over active communicators.
• Within each communicator, allows for iterating over each message
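The query-then-iterate pattern above can be sketched as a small, self-contained C model. Every name here (demo_process, demo_communicator, demo_dump_queues, and so on) is hypothetical and chosen for illustration only; the actual MQD entry points and their signatures are defined by the MQD document and the MPI implementation's debug DLL, not by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy model: none of these names come from the MQD interface. */
enum demo_queue {
    DEMO_UNEXPECTED_RECV,
    DEMO_POSTED_RECV,
    DEMO_SEND,
    DEMO_QUEUE_COUNT
};

struct demo_message {
    int peer_rank;   /* source or destination rank */
    int tag;
    size_t bytes;
};

struct demo_communicator {
    const char *name;
    const struct demo_message *queue[DEMO_QUEUE_COUNT];
    size_t queue_len[DEMO_QUEUE_COUNT];
};

struct demo_process {
    int has_queues;  /* nonzero if queue data is available (assumed flag) */
    const struct demo_communicator *comms;
    size_t comm_count;
};

const char *demo_queue_name(enum demo_queue q)
{
    switch (q) {
    case DEMO_UNEXPECTED_RECV: return "unexpected receive";
    case DEMO_POSTED_RECV:     return "posted receive";
    case DEMO_SEND:            return "send";
    default:                   return "unknown";
    }
}

/* Step 1: check for support. Step 2: iterate over communicators.
 * Step 3: iterate over each message in each conceptual queue.
 * Returns the number of messages visited, or -1 if unsupported. */
int demo_dump_queues(const struct demo_process *proc, FILE *out)
{
    assert(proc != NULL && out != NULL);
    if (!proc->has_queues)
        return -1;

    int visited = 0;
    for (size_t c = 0; c < proc->comm_count; ++c) {
        const struct demo_communicator *comm = &proc->comms[c];
        fprintf(out, "communicator %s\n", comm->name);
        for (int q = 0; q < DEMO_QUEUE_COUNT; ++q) {
            for (size_t m = 0; m < comm->queue_len[q]; ++m) {
                const struct demo_message *msg = &comm->queue[q][m];
                fprintf(out, "  [%s] peer=%d tag=%d bytes=%zu\n",
                        demo_queue_name((enum demo_queue)q),
                        msg->peer_rank, msg->tag, msg->bytes);
                ++visited;
            }
        }
    }
    return visited;
}
```

In a real debugger the data would be read from the target process through the debug DLL rather than from local structures; the sketch only shows the nesting of the three steps.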
Terms and Definitions
• DLL: An overloaded term that refers to any of: dynamic-link library, dynamically loaded library, or dynamic shared object file
• Image file: An executable or a shared library
• May contain symbols necessary for MQD
• MPI Process: An OS process that is part of the MPI application.
• The MPI standard does not require an OS process. It is a debugger requirement.
• mqs_image: An abstract concept that represents a collection of image files
• Loaded into the process address space
• Can change/relocate during runtime
• Host: Where we run the debuggers
• Target: Where the MPI process resides
Debug DLL
• Provided by the MPI implementation
• Debugger does not need to know the internal implementation
• Dynamically loaded by the debugger
• Located through the value of the MPIR_dll_name symbol in the process
• Invokes debugger-provided functionality through tables of callbacks
• During setup, the debugger calls a routine in the MQD DLL to provide the callbacks
• This allows the debug DLL to work with any MQD-compliant debugger
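The callback-table arrangement described above can be illustrated with a minimal C sketch. All names (demo_debugger_callbacks, demo_dll_setup_callbacks, demo_dll_read_int, the "queue_length" symbol, the fake address) are hypothetical, and both sides live in one file here for brevity; in practice the debugger would load the DLL dynamically and the real callback tables are those defined by the MQD interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table supplied by the debugger to the debug DLL. */
struct demo_debugger_callbacks {
    /* Look up a symbol in the target; returns 0 on success. */
    int (*find_symbol)(const char *name, unsigned long *addr);
    /* Copy len bytes of target memory into buf; returns 0 on success. */
    int (*read_memory)(unsigned long addr, void *buf, size_t len);
};

/* ---- Debug DLL side: knows MPI internals, not debugger internals ---- */
static const struct demo_debugger_callbacks *dll_callbacks;

/* Called once by the debugger during setup to hand over its callbacks. */
void demo_dll_setup_callbacks(const struct demo_debugger_callbacks *cb)
{
    assert(cb != NULL && cb->find_symbol != NULL && cb->read_memory != NULL);
    dll_callbacks = cb;
}

/* Read an int-valued target variable using only the callbacks. */
int demo_dll_read_int(const char *symbol, int *value)
{
    unsigned long addr;
    if (dll_callbacks == NULL)
        return -1;  /* setup has not happened yet */
    if (dll_callbacks->find_symbol(symbol, &addr) != 0)
        return -1;
    return dll_callbacks->read_memory(addr, value, sizeof *value);
}

/* ---- Debugger side: a fake target with a single known variable ---- */
#define DEMO_FAKE_ADDR 0x1000UL
static int fake_target_queue_length = 4;

int demo_dbg_find_symbol(const char *name, unsigned long *addr)
{
    if (strcmp(name, "queue_length") != 0)
        return -1;
    *addr = DEMO_FAKE_ADDR;
    return 0;
}

int demo_dbg_read_memory(unsigned long addr, void *buf, size_t len)
{
    if (addr != DEMO_FAKE_ADDR || len != sizeof fake_target_queue_length)
        return -1;
    memcpy(buf, &fake_target_queue_length, len);
    return 0;
}

const struct demo_debugger_callbacks demo_dbg_callbacks = {
    demo_dbg_find_symbol,
    demo_dbg_read_memory
};
```

Because the DLL side touches the target only through the table, the same DLL code would work unchanged with any debugger that fills in the table, which is the portability property the slide describes.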
Debugger/Debug DLL interaction
• See document
What’s next
• Can’t just put MQD at the end of MPIR; the two need to be merged in a way that makes sense
• Handle introspection interface
• MPIR_dll_name is not sufficient
• A 64-bit MPI process is compiled with MPIR_dll_name pointing to a 64-bit debug DLL, but a 32-bit debugger needs to load a 32-bit DLL!
• Resource management for MPIR
• How to know whether we were launched under srun, aprun, or mpirun
• What the job number is, etc.
• Connectivity information (what is the state of the connections between MPI processes)
