AI in Context: First Test of Several Generative AI Code Generators

Generative AI has many applications, including code generation. I usually program in Python, so I tested five code generators to see how they performed on an HTML file search operation. I restricted myself to browser-based generators. I did not test all such generators, but the results show an interesting range in quality and completeness.

The Task: Extracting the Title from an HTML File

I do a lot of text manipulation using Python. For example, the text source for my book Dancing with Qubits, Second Edition, combines HTML and LaTeX. I process the HTML and pull out the LaTeX in PRE blocks that have a special class designator. I run these through pdflatex to create small PDF files and then use ImageMagick to generate JPG images. I then insert IMG tags with links to the images into the final HTML file.

I also have special SPAN tags with HTML math in the content areas and LaTeX markup in an attribute. This combination allowed Packt Publishing and me to create an eBook, a PDF for printing the book, and a color-highlighted PDF for purchasers of the other two forms. Sometimes, I think it took me longer to write the processing software than the book itself.

Over several years, I wrote all the code and used an earlier version to generate my Dancing with Python book. I use certain idioms repeatedly, especially to parse HTML files. The workhorse Python package I use is beautifulsoup4. It has methods to read a file or string containing HTML and then search or navigate the document tree.
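As a rough illustration of that parse-then-search idiom, here is a minimal sketch using beautifulsoup4 on an in-memory string. The sample HTML and the class name `latex` are assumptions made up for this example; the actual class designator used for the book's PRE blocks is not given in this article.

```python
from bs4 import BeautifulSoup  # installed with: pip install beautifulsoup4

# Hypothetical sample document; the class name "latex" is an assumption.
HTML = r"""
<html>
  <head><title>Sample Chapter</title></head>
  <body>
    <pre class="latex">\frac{1}{2}</pre>
    <pre>plain text</pre>
  </body>
</html>
"""

# Parse the string with Python's built-in parser backend.
soup = BeautifulSoup(HTML, "html.parser")

# Navigate: soup.title is the first <title> element, or None if absent.
title_text = soup.title.get_text(strip=True) if soup.title else None

# Search: collect the text of every <pre> element carrying the assumed class.
latex_blocks = [pre.get_text() for pre in soup.find_all("pre", class_="latex")]

print(title_text)
print(latex_blocks)
```

The same `BeautifulSoup` constructor also accepts an open file object, which is the route a file-based version of this task would take.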

Because I had not used a generative AI code generator before, except for regular expressions, I thought a fun initial task would be to ask several of them to write code that reads an HTML file and prints the TITLE tag’s contents. There are more code generators than the five I show here, and I used only browser-based prompt-and-response services.

Many AI code generators have extensions for Integrated Development Environments (IDEs), such as Visual Studio Code. I will examine some of these in a future Analyst Insight.

The Prompt

The prompt I used with each code generation service is:

Write me Python code that parses an HTML file and prints the title.

I will score the code this way:

    • 0 points total if the code is not Python. I stop looking at the code.
    • 5 points if it reads a file, 1 point if it uses a with statement, and 1 point if it specifies an encoding.
    • 5 points if it checks that the HTML file exists.
    • If it doesn’t read a file, 1 point if it reads a string.
    • 5 points if it finds the TITLE tag and contents.
    • 1 point if it checks that the title exists.
    • 2 points if it prints the title.

The code should be in Python, read an HTML file using modern Python syntax, understand that the title is in an HTML tag, extract the tag’s contents, and print the result. The maximum score is 20.
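For reference, one way to satisfy every item in this rubric is sketched below. It is not the output of any of the tested services. It uses the standard library's `html.parser` rather than beautifulsoup4 so that it runs with no extra installs, and the names `TitleParser`, `read_title`, and `print_title`, along with the UTF-8 encoding, are assumptions chosen for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collect the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._finished = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; only the first <title> is collected.
        if tag == "title" and not self._finished:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._finished = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        """Return the stripped title text, or None if missing or empty."""
        text = "".join(self._parts).strip()
        return text or None


def read_title(path):
    """Return the title of the HTML file at `path`, or None if it has none."""
    html_path = Path(path)
    # Rubric: check that the HTML file exists.
    if not html_path.is_file():
        raise FileNotFoundError(f"No such HTML file: {html_path}")
    # Rubric: read the file with a `with` statement and an explicit encoding.
    # UTF-8 is an assumption; a real file may declare something else.
    with html_path.open(encoding="utf-8") as handle:
        html = handle.read()
    # Rubric: find the TITLE tag and its contents.
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def print_title(path):
    """Print the title, or a message if the file or the title is missing."""
    try:
        title = read_title(path)
    except FileNotFoundError as error:
        print(error)
        return
    # Rubric: check that the title exists, then print it.
    print(title if title is not None else "No title found")
```

Calling `print_title("example.html")` (a hypothetical file name) prints the title, or a short message when the file or the title is missing.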

ZZZ Code AI

The input:

[Image: prompt entered into ZZZ Code AI]

The output:

[Image: code generated by ZZZ Code AI]

The score: 8 = 1 (reads string) + 5 (finds title) + 2 (prints title)

The main issue with this code is that I asked for code to read a file, and it reads a string instead.

Codeium

The input:

[Image: prompt entered into Codeium]

The output:

[Image: code generated by Codeium]

The score: 13 = 6 (reads file using with) + 5 (finds title) + 2 (prints title)

This code did not earn the points for checking that the file exists, specifying a file encoding, or checking that the title exists.

Microsoft Copilot

The input:

[Image: prompt entered into Microsoft Copilot]

The output:

[Image: code generated by Microsoft Copilot]

The score: 14 = 6 (reads file using with) + 5 (finds title) + 1 (checks title exists) + 2 (prints title)

Very nice touch at the end noting that I should change the file name in the code! The code should have checked that the file existed.

OpenAI ChatGPT 3.5

The input:

[Image: prompt entered into OpenAI ChatGPT 3.5]

The output:

[Images: code and instructions generated by OpenAI ChatGPT 3.5]

The score: 15 = 7 (reads file using with and encoding) + 5 (finds title) + 1 (checks that the title exists) + 2 (prints title)

This is great code, except it doesn’t check if the file exists. The extra factoring out of the function to find the title is very nice, as is the instruction for installing the beautifulsoup4 package.

CodePal

The input:

[Image: prompt entered into CodePal]

The output:

[Images: code generated by CodePal]

The score: 19 = 5 (checks that the file exists) + 6 (reads file using with) + 5 (finds title) + 1 (checks that the title exists) + 2 (prints title)

This is the most complete code and has the top score, losing only one point for not specifying a file encoding. It has extensive error checking, something I would require in production-quality code.
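As a quick cross-check, the five totals can be recomputed from the rubric. This is only a small sketch: the dictionary keys are shorthand names introduced here, and the mapping of each service to its criteria is read off the score lines above, taking the single 1-point check in each breakdown to be the title check.

```python
# Point values from the scoring rubric; the key names are shorthand.
RUBRIC = {
    "reads_file": 5,
    "uses_with": 1,
    "sets_encoding": 1,
    "checks_file_exists": 5,
    "finds_title": 5,
    "checks_title_exists": 1,
    "prints_title": 2,
    "reads_string": 1,  # only awarded when no file is read
}

# Criteria each service met, as read from the score breakdowns.
EARNED = {
    "ZZZ Code AI": ["reads_string", "finds_title", "prints_title"],
    "Codeium": ["reads_file", "uses_with", "finds_title", "prints_title"],
    "Microsoft Copilot": [
        "reads_file", "uses_with", "finds_title",
        "checks_title_exists", "prints_title",
    ],
    "OpenAI ChatGPT 3.5": [
        "reads_file", "uses_with", "sets_encoding", "finds_title",
        "checks_title_exists", "prints_title",
    ],
    "CodePal": [
        "checks_file_exists", "reads_file", "uses_with", "finds_title",
        "checks_title_exists", "prints_title",
    ],
}


def total(criteria):
    """Sum the rubric points for a list of criterion names."""
    return sum(RUBRIC[name] for name in criteria)


scores = {tool: total(criteria) for tool, criteria in EARNED.items()}

# The maximum excludes the fallback point for reading a string.
MAX_SCORE = sum(pts for name, pts in RUBRIC.items() if name != "reads_string")

for tool, score in scores.items():
    print(f"{tool}: {score} / {MAX_SCORE}")
```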

Key Takeaway: Generative AI code generators can give developers a head start in developing applications. Human intervention may be required to ensure the code checks for errors and uses modern coding practices and libraries.

I was impressed at first with the code these five services generated. They appeared to use the standard Python package to perform the requested action. On closer comparison, one did not follow the prompt well, and most of the others had little to no error checking. Do not use generated code without carefully inspecting what it does and how. Though my example did not have security considerations, you should carefully examine generated code to make sure it does not open holes that attackers could exploit.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually, based on data and other information that might have been provided for validation, and are not those of The Futurum Group as a whole.

Author Information

Dr. Bob Sutor

Dr. Bob Sutor is an expert in quantum technologies with 40+ years of experience. He is the accomplished author of the quantum computing book Dancing with Qubits, Second Edition. Bob is dedicated to evolving quantum to help solve society's critical computational problems.
