[a] We first had to convert the PDF, or its Microsoft Word source, into a spreadsheet format. Other challenges then emerged: character encoding (e.g., non-printable characters and the curly "smart" quotes that Microsoft Office substitutes by default), combining multiple sources (e.g., CDASH v1.1 and CDASH User Guide v1.0), and fine details such as how to handle the NullFlavor values described in SDTM's Trial Summary (TS) domain. Is NullFlavor intended to be registered as CDISC Controlled Terminology in NCI EVS? Further, its governing authority could be either HL7 or ISO 21090.
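To illustrate the character-encoding cleanup, here is a minimal Python sketch of how text extracted from Word or PDF might be normalized before it is loaded into a spreadsheet. The function name `clean_cell` and the punctuation mapping are assumptions made for this example, not the process we actually used; a real cleanup would likely need a longer mapping.

```python
import unicodedata

# Assumed mapping of common Office "smart" punctuation to plain ASCII.
SMART_PUNCTUATION = {
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u00a0": " ",   # non-breaking space
}
_TRANSLATION = str.maketrans(SMART_PUNCTUATION)


def clean_cell(text: str) -> str:
    """Replace smart punctuation and drop non-printable characters."""
    text = text.translate(_TRANSLATION)
    # Unicode categories starting with "C" cover control and format
    # characters; tabs and newlines are kept on purpose.
    return "".join(
        ch for ch in text
        if ch in "\t\n" or unicodedata.category(ch)[0] != "C"
    )


if __name__ == "__main__":
    print(clean_cell("\u201cAge\u201d\u00a0Unit\x0b\u200b"))
```

Mapping dashes and non-breaking spaces to ASCII is a lossy choice; whether it is acceptable depends on how the text is used downstream.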
[b] A good example is identifying codelist supersets and subsets in the CDISC Controlled Terminology. For instance, the Age Unit (C66781) codelist is a subset of the Unit (C71620) codelist.
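The subset check can be sketched with plain Python sets. The term lists below are small illustrative excerpts, not the published codelist contents, and `CX0001` is a hypothetical codelist added only to show a non-matching case. Comparing by submission value is also an assumption; comparing by term concept code would work the same way.

```python
from itertools import permutations

# Illustrative excerpts keyed by codelist code (not the full published lists).
codelists = {
    "C66781": {"DAYS", "HOURS", "MONTHS", "WEEKS", "YEARS"},  # Age Unit
    "C71620": {"DAYS", "HOURS", "MONTHS", "WEEKS", "YEARS",
               "kg", "mg", "mL"},                             # Unit (excerpt)
    "CX0001": {"N", "Y"},                                     # hypothetical
}


def find_subsets(codelists: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (subset, superset) pairs where one codelist's terms are
    strictly contained in another's."""
    return sorted(
        (a, b)
        for a, b in permutations(codelists, 2)
        if codelists[a] < codelists[b]  # proper-subset test on sets
    )


if __name__ == "__main__":
    for subset, superset in find_subsets(codelists):
        print(f"{subset} is a subset of {superset}")
```

This pairwise comparison is quadratic in the number of codelists, which should be fine at the scale of a terminology package but is a deliberate simplification.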