Back in the 1800s, the chances of striking oil with a well in the United States were low. People already knew about petroleum "seeps", places where natural liquids escape to the Earth's surface, but for hundreds of years, skimming the oil off the water had been too expensive and complex.

It wasn't until 1859 that Colonel Drake decided to change the story. His strategy was to use a steam engine to ram a drill through the soil of Pennsylvania in the hope of finding oil. Not only did he run out of money midway through the work, but his effort also drew ridicule: it was known derisively as “Drake’s folly” [1], as townsfolk doubted his plans would work. None of that stopped Drake, and he took out a loan [2] to keep the operation going until he found oil 69.5 feet under the ground. Today, his well is known as the Drake Well and sits at the center of the Drake Well Museum. It is the first commercial oil well in the United States, and was designated a National Historic Chemical Landmark in 2009.

He changed the game by exploring new areas for oil instead of exploiting the seeps that had already been found. The industry jumped on the idea, and the search for new wells, and for new ways of finding oil, skyrocketed.

That said, exploiting the wells already found never stopped. Although explorers expanded west from Ohio to Texas, the trade-off between exploring and exploiting never vanished. Should we keep squeezing the old wells and optimize their extraction, or should we take the risk of drilling in new places?

The same terminology exists in software

There are only two hard things in Computer Science: cache invalidation and naming things.

-- Phil Karlton

We borrow terminology and literature from different fields of science and incorporate them into computer software, not only because naming is hard, but also because those terms fit software problems so well. Software is a fairly young industry, and other industries faced some of the same challenges long before we did. As a result, we imported the terms "explore" and "exploit" directly into the software literature and use them in soft computing approaches.

Imagine an administrator who wants to hire the best secretary out of n rankable applicants for a position. The applicants are interviewed one by one in random order. A decision about each particular applicant is to be made immediately after the interview. Once rejected, an applicant cannot be recalled. During the interview, the administrator gains information sufficient to rank the applicant among all applicants interviewed so far, but is unaware of the quality of yet unseen applicants. The question is about the optimal strategy (stopping rule) to maximize the probability of selecting the best applicant. If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately [3]. This is the famous secretary problem.

The decision between exploring and exploiting is not hard here. For the secretary problem, the optimal rule is known: explore roughly the first 37% of the applicants (a fraction of 1/e) without hiring any of them, and then pick the first applicant who scores better than everyone explored so far. This rule is mathematically provable, and as n grows it selects the best applicant with probability approaching 1/e. However, this is not always the case: in most problems there is no such formula, and the choice between exploration and exploitation leaves us up the creek without a paddle.
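The 37% rule is easy to check empirically. The following is a minimal simulation sketch, not a proof: the function names `secretary_pick` and `success_rate` are made up for this illustration, and applicants are assumed to be represented by distinct integer scores in random order.

```python
import math
import random


def secretary_pick(scores, explore_fraction=1 / math.e):
    """Apply the explore-then-commit rule to one interview sequence.

    Returns the index of the applicant who gets hired.
    """
    n = len(scores)
    cutoff = int(n * explore_fraction)  # applicants we only observe
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if scores[i] > best_seen:
            return i  # first applicant better than everyone explored
    return n - 1  # nobody beat the benchmark, so the last one is hired


def success_rate(n=100, trials=20_000, explore_fraction=1 / math.e, seed=0):
    """Estimate how often the rule hires the single best applicant."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))  # hypothetical scores: n - 1 is the best
        rng.shuffle(scores)
        chosen = secretary_pick(scores, explore_fraction)
        wins += scores[chosen] == n - 1
    return wins / trials


if __name__ == "__main__":
    for fraction in (0.05, 1 / math.e, 0.9):
        print(f"explore {fraction:.2f} -> success {success_rate(explore_fraction=fraction):.3f}")
```

Running it with different values of `explore_fraction` shows both failure modes: explore too little and the benchmark is weak, explore too much and the best applicant has probably already been rejected.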

The Soulmate Algorithm I (by icanbarelydraw.com) [4]

In many software problems, we are dealing with enormous input and output spaces. In this situation, soft computing approaches come into play: they look for good-enough solutions to problems that have no known polynomial-time solution. Think about it: how would you find the next best move in chess? The search space is astronomical, and trying every move would take forever.

One approach is to generate a solution and then use a fitness function to evaluate how good it is. We can then decide either to exploit, by working on that solution and improving it, or to explore, by generating an entirely new one. That decision depends on the value of the fitness function. This is how metaheuristic procedures like genetic algorithms and simulated annealing optimize solutions.

Example of Exploration vs Exploitation in 8 queens

The image above illustrates exploitation versus exploration in the 8 queens problem. In this puzzle, one should place eight chess queens on an 8×8 board so that no two queens threaten each other; thus, a solution requires that no two queens share the same row, column, or diagonal. Here we found a placement that has only three threats, but the optimal solution should have zero. Now, we can either exploit it by slightly modifying the placement to see if we get a better result, or forget about it and explore an entirely new one.
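As a rough sketch of that choice in code, the loop below uses the number of threats as the fitness function. With a small probability it explores by throwing the board away and drawing a new one; otherwise it exploits by moving a single queen and keeping the change if the board does not get worse. The names `threats`, `random_board`, and `solve`, and the value of `explore_prob`, are assumptions made for this illustration rather than a standard algorithm or tuned setting.

```python
import random

N = 8


def threats(board):
    """Fitness: number of queen pairs attacking each other (0 is a solution).

    board[c] is the row of the queen in column c, so columns never clash.
    """
    count = 0
    for a in range(N):
        for b in range(a + 1, N):
            same_row = board[a] == board[b]
            same_diagonal = abs(board[a] - board[b]) == b - a
            count += same_row or same_diagonal
    return count


def random_board(rng):
    """Draw a completely new placement: one queen per column, random rows."""
    return [rng.randrange(N) for _ in range(N)]


def solve(explore_prob=0.005, max_steps=200_000, seed=0):
    """Return (board, steps_used), or (None, max_steps) if nothing was found."""
    rng = random.Random(seed)
    current = random_board(rng)
    for step in range(max_steps):
        if threats(current) == 0:
            return current, step
        if rng.random() < explore_prob:
            current = random_board(rng)  # explore: start over somewhere new
            continue
        # Exploit: move one queen within its column and keep the change
        # only if the number of threats does not increase.
        candidate = current[:]
        candidate[rng.randrange(N)] = rng.randrange(N)
        if threats(candidate) <= threats(current):
            current = candidate
    return None, max_steps


if __name__ == "__main__":
    board, steps = solve()
    print("board:", board, "steps:", steps)
```

Setting `explore_prob` to 0 gives pure exploitation, which can wander for a long time around a placement it cannot improve; setting it close to 1 gives pure exploration, which is just random guessing.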

The Right Exploits, The Left Explores

In politics, we encounter optimization problems frequently. Many parameters need to be optimized, and there is no definite solution for optimizing them, especially in such a dynamic, complex environment where the parameters evolve rapidly. Starting from the status quo, we have many decisions to make. For example:

  • Should we seek to grant new rights to underrepresented groups (explore), or should we protect the already established culture and keep it as it has been working for a long time (exploit)?
  • Should we practice the religion that has worked for thousands of years (exploit), or should we reduce the weight of religion and allow new traditions to emerge (explore)?
  • Should we lower the taxes so business owners can emerge and grow (exploit), or should we increase the taxes to help the working class (explore)? [5]
  • Should we be tough on criminals and empower law enforcement (exploit), or should we try to reduce the causes of crime and act like a father admonishing his children (explore)? [6]
  • Should we increase the rate of accepting immigrants (explore), or should we slow down the process (exploit)?

Political Steering Wheel

Each of the questions above depicts, in its own way, a situation of exploration versus exploitation. Deciding which policy to implement is like turning a ship's helm, and in a democratic country, it is the people who decide which direction to steer.

That means every time the left wing fights the right wing and vice versa, it's for the greater good. This helps us optimize the solution and get one step closer to the Goldilocks point. Of course, people can make mistakes. However, sooner or later, the fitness function reveals whether the direction taken was a good choice, and mistakes can be corrected in subsequent iterations.

There remains only one question. What is the fitness function of your country?

References and Footnotes

Footnotes

  1. Drake’s Pa. Oil Well Idea Changed World. [Online]. timesleader.com/archive/1241892/drakes-pa-oil-well-idea-changed-world

  2. "When First Oil Flowed". The New York Times. July 22, 1934. p. XX-12.

  3. Secretary problem. [Online]. en.wikipedia.org/wiki/Secretary_problem

  4. Soulmate Algorithm — I. [Online]. icanbarelydraw.com/comic/397

  5. Leftist governments count as explorative governments here because, historically, workers' welfare received little attention, and it was only after the Industrial Revolution that things changed rapidly.

  6. Historically, punishment for criminals has often been harsh and merciless. However, with the advancement of psychological science, we have come to the conclusion that perhaps we can take a more lenient approach.

  7. Blake, Aaron (25 November 2021). "Why are there only two parties in American politics?". Washington Post. ISSN 0190-8286. Retrieved 25 September 2023.

  8. Palfrey, T. (1989) ‘A mathematical proof of Duverger’s law’, in P. Ordeshook (ed.) Models of Strategic Choice in Politics, Ann Arbor: University of Michigan Press, 69–91.