Marie Claire Chelini, Trinity Communications
“One could argue that this started over 25 years ago.”
That’s how Steffen Bass described his collaboration with Assistant Professor of Statistical Sciences Simon Mak. Twenty-five years ago, however, Mak wasn’t even in high school.
But this isn’t the story of a teenage prodigy. This is the story of how new tools, developed by a junior scientist thinking outside the box, can shed light on old questions — even if the questions are hidden within something smaller than an atom.
“An atomic nucleus behaves much like a liquid drop,” said Bass, who is the Arts and Sciences Distinguished Professor of Physics. “For a regular liquid like a glass of water, you put it on your stovetop, and it’ll turn to vapor — a gas — at 100°C. You put it in the freezer, and it’ll turn solid at 0°C.
“If I take my glass of water up onto Mount Everest, where the air pressure is much lower, it’ll boil around 70°C. That’s well known, and it’s kind of boring, right? Now, at what temperature and at what pressure does my liquid drop atomic nucleus change from a liquid to a gas, or a plasma?”
But how do you heat up or compress something as small as an atom’s nucleus? Not with a stovetop or with Mount Everest, but with a particle collider. As two nuclei crash into each other, they compress, heat up and may change from a liquid to a gas or a plasma. But there’s a big problem: not only are the atoms incredibly small, but their collision is incredibly fast. It’s all over in 10⁻²⁴ seconds.
These collisions are akin to a car crash test, but they can’t be replayed in slow motion. “What we get is this debris field,” said Bass, “and we have to try to reconstruct the crash based on it.”
These reconstructions are complex computer simulations with multiple parameters that can be tweaked until the simulation matches the debris field. Considering that there are millions of possible combinations of parameters, and that each simulation might take 10,000 CPU hours, the problem becomes unmanageable. “I once calculated that the amount of computer time we’d need to run these simulations is several times greater than the age of the universe. We’re talking quadrillions of CPU hours,” said Bass. In other words: it's impossible.
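For a sense of scale, here is a rough back-of-envelope check in Python, reading “quadrillions” as 10^15 CPU hours and taking the age of the universe as roughly 13.8 billion years; both inputs are order-of-magnitude estimates, not precise figures.

```python
# Rough sanity check of the scale quoted above: how do "quadrillions of
# CPU hours" compare with the age of the universe, expressed in hours?
# Both inputs are order-of-magnitude estimates, not precise figures.

cpu_hours_needed = 1e15                        # one quadrillion CPU hours
age_of_universe_hours = 13.8e9 * 365.25 * 24   # ~13.8 billion years, in hours

print(f"CPU hours needed        : {cpu_hours_needed:.2e}")
print(f"Age of universe (hours) : {age_of_universe_hours:.2e}")
print(f"Ratio                   : {cpu_hours_needed / age_of_universe_hours:.1f}x")
```

The ratio comes out to roughly eight, consistent with “several times greater than the age of the universe.”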
“That's where machine learning comes into play,” said Mak.
Mak uses AI — specifically machine learning — to interpolate between the results of a limited number of simulations, so that instead of trying every possible combination of parameters, researchers only need to run the most likely ones.
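The statistical machinery behind the actual analyses is more elaborate, but the core idea can be sketched with a generic Gaussian-process surrogate, a standard emulation technique. In this minimal, hypothetical example, the cheap expensive_simulation function stands in for a physics code that would really cost thousands of CPU hours per run, and its single parameter stands in for the many knobs of a real model.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Stand-in for the expensive physics simulation: in reality each call
# would cost thousands of CPU hours; here it is a cheap toy function
# of a single model parameter.
def expensive_simulation(theta):
    return np.sin(3 * theta) + 0.5 * theta

# Step 1: run the "real" simulator at only a handful of parameter settings.
design_points = np.linspace(0.0, 2.0, 8).reshape(-1, 1)
observations = expensive_simulation(design_points).ravel()

# Step 2: fit a Gaussian-process emulator (surrogate) to those few runs.
kernel = ConstantKernel(1.0) * RBF(length_scale=0.5)
emulator = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
emulator.fit(design_points, observations)

# Step 3: interpolate the simulation output at thousands of untried
# parameter values, each prediction costing microseconds instead of
# CPU hours, with an uncertainty estimate attached.
candidates = np.linspace(0.0, 2.0, 2000).reshape(-1, 1)
predicted_mean, predicted_std = emulator.predict(candidates, return_std=True)

best = candidates[np.argmax(predicted_mean)]
print(f"Parameter value predicted to maximize the output: {best[0]:.3f}")
```

Once trained on a handful of real runs, a surrogate like this can be queried at untried parameter settings almost instantly, with an uncertainty estimate attached to each prediction, which is what turns an impossible brute-force search into a manageable one.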
Bass had been successfully collaborating with statisticians on such methods for years when Mak first arrived at Duke. “Simon was young and hungry. He really took this to the next level and came up with a bunch of new ideas that went way beyond what we had been doing.”
Their collaboration is part of JETSCAPE, a multi-institutional project funded by the National Science Foundation. Using Mak’s methods, the team has been able to shave up to 10 orders of magnitude — 10 zeros — off the CPU hours needed for each problem, turning it into something a supercomputing center can tackle in a couple of days.
Mak accelerates the particle physics simulations by creating “digital twins.”
“Think about building a rocket,” he explained. “Traditionally, these were built in labs; people had to prototype each engine part, build this billion-dollar project and then test it. With recent developments in high-performance computing, people are realizing, ‘Look, we can actually code a lot of this stuff right on the computer.’ And that’s what’s called digital twins. They replicate the actual physical process but don’t incur the cost of building it.”
Just as digital twins help optimize rocket design, they can optimize the reconstruction of a high-energy particle collision. But there is an art to building an efficient digital twin without losing accuracy.
“A large part of my work is finding creative ways to relax this trade-off, by fusing data or integrating prior information that can limit the number of possibilities we need to explore,” said Mak. “It's not just crunching the math. You have to be creative in what data sources you can leverage and in how you leverage them.”
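As a hypothetical illustration of how prior information can shrink the search (a toy sketch, not a description of Mak’s actual methods), a simple Bayesian calibration can combine an emulator’s prediction, one “observed” measurement and a prior on the parameter; every number and function below is made up for the example.

```python
import numpy as np

# Toy stand-in for an emulator's prediction of one observable as a
# function of a single model parameter theta (purely illustrative).
def predicted_observable(theta):
    return np.sin(3 * theta) + 0.5 * theta

observed_value = 1.2      # pretend "debris field" measurement
measurement_error = 0.1   # its assumed uncertainty

theta_grid = np.linspace(0.0, 2.0, 1001)
prediction = predicted_observable(theta_grid)

# Likelihood: how well each candidate parameter value reproduces the data.
likelihood = np.exp(-0.5 * ((prediction - observed_value) / measurement_error) ** 2)

# Prior information (for example, from earlier experiments or theory)
# saying the parameter is very unlikely to lie outside [0.5, 1.0].
prior = np.where((theta_grid >= 0.5) & (theta_grid <= 1.0), 1.0, 1e-6)

posterior = likelihood * prior
posterior /= posterior.sum()

# The prior concentrates the posterior on a much smaller region.
credible = theta_grid[posterior > 1e-3 * posterior.max()]
print(f"Plausible parameter range: [{credible.min():.2f}, {credible.max():.2f}]")
```

In this toy setup the measurement alone is consistent with two separate parameter regions; the prior rules one of them out, so any expensive follow-up simulations would only need to target the other.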
Even though much of Mak’s current research relates to high-energy nuclear physics, the potential behind his methods goes far beyond the physical sciences. A quick glance at his research program shows collaborations on topics ranging from 3D-printed aortic valves to drone design.
The common thread between these areas? Interesting problems that cannot be solved in a reasonable amount of time.
“In my view, it's my job as a data scientist to be a problem solver,” he said. “Everyone uses data in different ways and for different goals. Part of the job is to understand these goals and the problem’s unique characteristics, then design methods that can tackle them with interesting theory, algorithms, statistical and mathematical properties. But don’t just stop at the method design: take the next step, and actually solve the problem.”