We’ve heard a lot about AI rollouts like ChatGPT, and many folks are running to these tools thinking they can help enhance an ISO 9001 or AS9100 quality management system (QMS). The bad news is that these programs simply aren’t up to the task yet. The information spit out by ChatGPT is a mix of good information and outright bad information, and the end user is unlikely to know the difference.
Recently I was faced with a time-consuming — and, frankly, annoying — word processing task. I had to take an ISO standard (not 9001, but a really obscure one) and convert it into an audit checklist. To do this, I traditionally convert the ISO “shall clauses” into questions as a first step. So:
The policy shall be available and be maintained as documented information.
Becomes:
Is the policy available and maintained as documented information?
That’s the first step only because you need to customize the questions later, but you get the idea.
This is a huge time-suck. Every single sentence needs to be reshaped in this way. I’ve gotten good at using some search-and-replace phrases (converting “the organization shall” to “does the organization”), but every sentence still needs to be manually tweaked afterward. This can take hours.
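That first-pass conversion can be roughly sketched in code. This is a hypothetical illustration, not the author’s actual method: a couple of pattern rules (the rule list and function name here are invented for the example) turn common “shall” phrasings into questions, leaving everything else alone.

```python
import re

# Hypothetical first-pass rules: each maps an ISO "shall" phrasing
# to the opening of a yes/no question. A real checklist would need
# many more rules, plus manual review of every output sentence.
RULES = [
    (re.compile(r"^The organization shall\b", re.IGNORECASE),
     "Does the organization"),
    (re.compile(r"^The policy shall be\b", re.IGNORECASE),
     "Is the policy"),
]

def shall_to_question(sentence: str) -> str:
    """Naively convert a single 'shall clause' into a yes/no question."""
    text = sentence.strip().rstrip(".")
    for pattern, replacement in RULES:
        if pattern.match(text):
            return pattern.sub(replacement, text) + "?"
    return text  # unmatched sentences are left for manual tweaking

print(shall_to_question(
    "The policy shall be available and be maintained as documented information."
))
```

Note that even on the example sentence, the naive output keeps the stray “be maintained” phrasing, which is exactly why the hand-editing step described above never fully goes away.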
It sounded like a job well-suited for AI, right?
Start Your Engines
So I fired up ChatGPT and worked on crafting the right prompt. I ended up with something like “convert each sentence in the following text to questions which can yield a yes or no answer, while maintaining paragraph numbering.” The actual prompt was much more complicated, but — again — you get the idea.
It took about five tries of arguing with ChatGPT just to deal with formatting. No matter how I prompted it, the result would not maintain the clause numbering. Instead, it would insist on spitting out numbered lists, even if I added “do not create a numbered list, but maintain exactly the original paragraph numbering” or somesuch demand. No luck.
After about fifteen minutes, I gave up and figured if it at least converted the sentences into questions, I could add the clause numbers manually later.
Because of the size of the standard, I had to enter it in chunks, since ChatGPT cannot handle large documents. So that added time, as well.
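The chunking step can be sketched too. This is a minimal illustration under assumed conditions (the `chunk_paragraphs` name and the 8,000-character limit are invented for the example; real chat input limits vary by model and are measured in tokens, not characters): split the standard at paragraph boundaries so no clause gets cut mid-sentence.

```python
def chunk_paragraphs(text: str, max_chars: int = 8000) -> list[str]:
    """Split text into pasteable chunks, breaking only between paragraphs.

    A single paragraph longer than max_chars still becomes its own
    (oversized) chunk; it is never split internally.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # +2 accounts for the paragraph separator we re-add below
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be pasted into the chat window separately, with the same prompt repeated each time, which is where the extra minutes add up.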
The output was a mixed bag. Where it worked, it worked phenomenally well. Many of the questions were converted with perfect grammar, and all were phrased to prompt a “yes/no” answer. In some cases, ChatGPT took the ISO standard’s many bulleted lists, and converted them into a single question, adding the bullets to the question, which I liked. In other cases, it was smart enough to know when to break the bulleted list items into separate questions.
But ChatGPT failed enough times to make it wholly unreliable. In some cases, it skipped over entire paragraphs, and I was forced to do a line-by-line reading of the original to see what was missed, and craft the questions manually. In other cases, it forgot key bullet items. And, in still other cases, the question generated didn’t really ask what was being prompted by the original sentence, and the meaning changed entirely.
That last one was a killer. This meant that I still needed to spend hours reading the output and editing each sentence, to be sure it reflected what ISO was really asking for, and to ensure all requirements were addressed. Then I had to go back and add clause numbers.
In the end, I likely saved only about an hour over my original workflow. The entire thing still took hours, and that was before I could customize the questions for the client.
Not Ready for Prime Time
For blindingly simple tasks, ChatGPT works. As a test, I asked it to write an ISO 9001-compliant Quality Policy, and no matter how many iterations I asked for, each satisfied the requirements. This was likely because there are so many Quality Policies published online for ChatGPT to learn from. But writing a Quality Policy isn’t exactly hard, since ISO 9001 tells you what you need to put in it.
In other cases, ChatGPT just outright lied. While working on a complex Excel sheet, I sensed there was a better way to calculate my data, and thought there just might be an Excel formula I didn’t know about which might help. I threw my problem into ChatGPT and it spit out the necessary formulas, but the results were not at all what I wanted. In the end, I found that Excel simply cannot do what I was asking without macros or VBScript, because the feature isn’t included. ChatGPT, on the other hand, insisted that Excel could do it, and kept spitting out totally incorrect answers.
The temptation may be to have ChatGPT write your ISO procedures for you. We will ignore the fact that an AI program can’t possibly know how your company does business, so it will only spit out generic template-style documents; the people using ChatGPT to write ISO procedures probably don’t care about accuracy. But as it stands right now, AI is simply not capable of generating even generic procedures that fully comply with ISO 9001. It just can’t do it.
As the technology stands today, yes, it might save a few hours of work, but only that. The risk, however, is that it might add more hours of work to scour the output for errors which — in the ISO world — could lead to audit nonconformities.
That is not to say that ChatGPT and its AI brethren won’t be good for these tasks, as the technology improves. I fully suspect it will, and very soon. But as of right now, I urge users to be very careful when using AI in these use cases. You don’t want to fail your audit or lose your certificate because ChatGPT is still kinda stupid.
Christopher Paris is the founder and VP Operations of Oxebridge. He has over 30 years’ experience implementing ISO 9001 and AS9100 systems, and helps establish certification and accreditation bodies with the ISO 17000 series. He is a vocal advocate for the development and use of standards from the point of view of actual users. He is the writer and artist of THE AUDITOR comic strip, and is currently writing the DR. CUBA pulp novel series. Visit www.drcuba.world