The Brief on AI #11

What the Dutch government can learn from Mark Zuckerberg and ASML

Feb 01, 2024

Recent developments

The Netherlands adopts Europe's AI Act and invests €205 million in the sector
Big Tech is under increasing pressure from authorities for the way they are dominating the generative AI sector
Meta is open sourcing artificial general intelligence (AGI)

Quote of the week

“If markets are leading us in directions that we, as a democratic society, decide are not compatible with our vision of liberty or democracy, it is incumbent upon government to do something.”
Lina Khan, Chairwoman of the FTC

What to make of all of this

Meet the MANGs: Microsoft, Amazon, Nvidia and Google, four of the five biggest US companies by market value. They collectively generated $276bn of operating profits in 2023 and completely dominate the generative AI market, so much so that they are pushing the boundaries of what’s allowed under antitrust law.

One issue is how the MANGs are mimicking the role of venture capitalists with a unique self-fulfilling approach. Instead of only giving money to startups, they offer them access to resources such as computing power, chips, and cloud services. They then incorporate the AI tools that come out of the partnership into their own products. A great way to lock in customers while juicing your own revenues. Bill Gurley, general partner at Benchmark, a Silicon Valley venture capital firm, warns that this way of doing business leads to “a massive mess in the end”.

Lina Khan, aged 34 and regarded as the most feared woman in Silicon Valley, has served as chairwoman of the Federal Trade Commission since 2021. She is worried the MANGs’ partnerships are “distorting innovation and undermining fair competition”, a concern that led the FTC to open investigations. Additionally, competition authorities in both the UK and the EU are scrutinizing Microsoft's financial involvement in OpenAI.

At the other end of the spectrum, Meta is promising to open source AGI. Meta took the open source route last year with Llama 1 and 2, which has been a great success for both the startup ecosystem and Meta. When Meta made the move, Zuckerberg told investors he didn’t expect Llama to generate “a large amount of revenue in the near term, but over the long term, hopefully that can be something.” It appears his prediction was accurate.

Meta makes money from cloud computing companies like Microsoft and Amazon, which offer access to Llama as part of their own generative AI services. This led to 30 million monthly downloads in September 2023. Open sourcing has also resulted in significant cost reductions across Meta’s AI operations. Meta has effectively crowdsourced a large part of its R&D by engaging a large community of developers to refine and advance Llama. The project attracts the brightest minds in AI, giving Meta a pipeline of potential employees and collaborators.

Why am I telling you all of this? Because the Dutch government should take a close look at Mark Zuckerberg’s strategy. It has written a paper setting out its view on generative AI and allocated €205 million to invest in the technology. I applaud this, but I also foresee the government spreading this budget across a wide range of goals and achieving next to nothing. If I were in charge of the money, I would do the following:

Build an entirely new Large Language Model (LLM) and train it with Dutch data;
Change data policies so public data (including all case law) becomes available for training purposes while respecting privacy rules and regulations;
Set up deals with local publishers who can provide training data to prevent copyright infringement;
Open-source the model so other European countries and startups can use and train it with their data;
Foster a robust ecosystem of innovation and applications by local startups and researchers.

Is this an outrageous thought? Maybe. Is it financially feasible? €205 million is not a large budget, but it is enough to create a model that can stand up to the top contenders in the market. To give you some context, Jim Fan, a senior AI scientist at Nvidia, mentioned that training Llama 2 probably cost Meta around $20 million, and that training its forerunner cost around $2.4 million. If the French can do it with Mistral, why couldn’t the Dutch with "Zephyr"?

Taking this approach brings significant advantages for the Netherlands, as AI is playing an increasingly central role in the economy. The country can become a magnet for international talent and place itself at the very front of the AI value chain. This is how ASML became one of the world’s leading companies in the semiconductor industry. Not by building the chips, but by building the machines that build the chips.

AI tip of the week

Removing time stamps and line breaks from a transcript without getting a summary

In last week’s post I explained how to use Microsoft Teams to make transcripts of meetings, and I gave some tips on handling transcripts of messy meetings. But what if a meeting isn’t messy and you want to use the entire transcript to generate minutes or write an article? When this is the case, I run into two problems.

The format in which I download the transcript from Teams contains time stamps and unnecessary line breaks.
When I ask GPT to remove the time stamps and line breaks it makes a summary and I lose important information.

I tinkered with it for a while and nothing seemed to work, until I went back to my original prompt engineering notes from last summer. It’s very simple and it works!

Here is how:

Download the transcript.
Copy the first 10 lines of the transcript.
Open your chatbot of choice and write: This is the input:
Paste the 10 lines under your text.
Then write: This is the output:
Paste the 10 lines again and remove the time stamps and unnecessary line breaks by hand.
Then write: Execute this for the following text:
Paste the entire transcript under your text and hit enter.

And voilà, the bot will generate the entire text for you without summarizing it. Make sure to save the prompt. That way, next time all you have to do is replace the text of the transcript.
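
If you would rather assemble the prompt with a script than by hand, the steps above could be sketched roughly like this in Python. This is only an illustration: the function name build_few_shot_prompt and its parameters are made up for this example, the time stamp format in the sample is an assumption rather than the exact Teams format, and you still clean the example lines yourself before passing them in.

```python
def build_few_shot_prompt(transcript: str, cleaned_example: str, example_lines: int = 10) -> str:
    """Assemble the input / output / execute prompt described in the steps above.

    transcript      -- the raw transcript text, time stamps and all
    cleaned_example -- the first lines of the transcript, cleaned by hand
    example_lines   -- how many raw lines to show as the "input" example
    (All names here are hypothetical and chosen for this sketch.)
    """
    # The first few raw lines serve as the "input" half of the example.
    raw_example = "\n".join(transcript.splitlines()[:example_lines])

    return (
        "This is the input:\n"
        f"{raw_example}\n\n"
        "This is the output:\n"
        f"{cleaned_example}\n\n"
        "Execute this for the following text:\n"
        f"{transcript}"
    )


if __name__ == "__main__":
    # Made-up sample; a real Teams export may use a different time stamp format.
    sample = "0:00:01\nGood morning\neveryone.\n0:00:05\nLet's get started."
    cleaned = "Good morning everyone."
    print(build_few_shot_prompt(sample, cleaned, example_lines=3))
```

Paste the printed prompt into your chatbot of choice. Because the example is built from the transcript itself, the only thing that changes from meeting to meeting is the text you pass in.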
