Building a Concurrency Testing Framework for database transactions using Scala

Thang Long Vu
Databricks, Software Engineer
About This Talk

Context: In highly concurrent distributed systems, handling tens of millions of operations is a massive challenge. Delta Lake relies on Optimistic Concurrency Control and avoids locking entirely, allowing transactions to proceed in parallel without knowing of each other's existence. Conflicts are instead detected and resolved at commit time.

Handling concurrency in Delta Lake means navigating many scenarios that are tough to catch, because transactions execute independently:

1. Different operations can conflict in unpredictable ways, and we need to track them with fine-grained accuracy.

2. We need a way to easily interleave operations, pause them at a precise phase, and check for conflicts.
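To make the commit-time conflict check concrete, here is a minimal sketch of optimistic concurrency control. It is a toy model, not Delta Lake's implementation: `ToyTable`, `TableState`, `ConflictException`, and `OccDemo` are hypothetical names, and the rule used here (any intervening commit is a conflict) is deliberately the simplest one possible, far coarser than the conflict detection a real system would use.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical toy model for illustration; not Delta Lake's API.
final case class TableState(version: Long, files: Set[String])

final class ConflictException(msg: String) extends RuntimeException(msg)

final class ToyTable {
  private val state =
    new AtomicReference[TableState](TableState(0L, Set.empty[String]))

  // Reading a snapshot takes no lock; a transaction just remembers what it saw.
  def snapshot(): TableState = state.get()

  // Commit succeeds only if no other commit landed since `read` was taken.
  // compareAndSet compares references, so `read` must be the object
  // returned by snapshot().
  def tryCommit(read: TableState, addFiles: Set[String]): Long = {
    val next = TableState(read.version + 1, read.files ++ addFiles)
    if (state.compareAndSet(read, next)) next.version
    else throw new ConflictException(
      s"snapshot at version ${read.version} is stale")
  }
}

object OccDemo {
  // Two transactions read the same snapshot; the first commit wins,
  // the second is rejected at commit time.
  def run(): (Long, Boolean) = {
    val table = new ToyTable
    val snapA = table.snapshot()
    val snapB = table.snapshot()
    val committedVersion = table.tryCommit(snapA, Set("a.parquet"))
    val secondConflicted =
      try { table.tryCommit(snapB, Set("b.parquet")); false }
      catch { case _: ConflictException => true }
    (committedVersion, secondConflicted)
  }
}
```

Neither transaction blocks the other while it works; the cost of optimism is paid only by the loser, and only when it tries to commit.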

Solution: Thanks to Scala, we were able to build a powerful, idiomatic, and easy-to-use testing framework with the following benefits:

1. Fine-grained error catching: We can pause an operation mid-execution, allowing us to perform our validations.

2. Reusable building blocks: Generic mixin helpers serve as building blocks that are easily reused across different operations and scale to any number of transactions.

3. Focus on conflict scenarios: Instead of worrying about how to implement tests, we can spend more time on the real problem: reasoning about the complex interactions between transactions and resolving them.

4. Abstracted setup and cleanup: Test code focuses on the test itself, while the framework abstracts away setup and cleanup, making tests easier to write.
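The idea of pausing an operation at a precise phase and interleaving another one can be sketched with two latches and a mixin. This is an illustrative sketch only, not the framework from the talk: `Phase`, `EventLog`, `PausableTxnSupport`, and `PauseDemo` are hypothetical names, and the "transactions" are stand-ins that merely record events.

```scala
import java.util.concurrent.CountDownLatch
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical names throughout; a sketch, not the Delta Lake test framework.

// A named point where an operation stops until the test lets it continue.
final class Phase(val name: String) {
  private val entered = new CountDownLatch(1)
  private val released = new CountDownLatch(1)

  // Called by the operation: announce arrival, then block until released.
  def arriveAndAwait(): Unit = { entered.countDown(); released.await() }

  // Called by the test: block until the operation has reached this phase.
  def awaitEntered(): Unit = entered.await()

  // Called by the test: let the paused operation continue.
  def release(): Unit = released.countDown()
}

// Thread-safe, ordered record of what happened.
final class EventLog {
  private var events = Vector.empty[String]
  def record(e: String): Unit = synchronized { events = events :+ e }
  def snapshot: Vector[String] = synchronized(events)
}

// Mixin that any test suite could reuse to run an operation in the background.
trait PausableTxnSupport {
  protected def executionContext: ExecutionContext = ExecutionContext.global

  def runInBackground[A](body: => A): Future[A] =
    Future(body)(executionContext)
}

object PauseDemo extends PausableTxnSupport {
  def run(): Vector[String] = {
    val log = new EventLog
    val beforeCommit = new Phase("beforeCommit")

    // "Transaction A" prepares, then pauses right before committing.
    val txnA = runInBackground {
      log.record("A: prepared")
      beforeCommit.arriveAndAwait()
      log.record("A: committed")
    }

    beforeCommit.awaitEntered()   // A is now parked mid-execution
    log.record("B: committed")    // interleave "transaction B" here
    beforeCommit.release()        // let A finish
    Await.result(txnA, 10.seconds)
    log.snapshot
  }
}
```

Because the test thread decides when each phase is released, the interleaving is deterministic: the log always reads `A: prepared`, `B: committed`, `A: committed`, and validations can run while A is parked.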

Goal: From this talk, I hope to:

1. Spread the main ideas of the Concurrency Testing Framework; I have not found another similar framework.

2. Show how to use Scala's functional programming power to build a clean, idiomatic, and reusable testing framework.

3. Express my passion for Scala and databases.

4. Speak live on stage at a conference for the first time, if I get accepted; it will certainly be a thrilling and rewarding experience.

5. Hear from other Scala enthusiasts, to share knowledge and cross-pollinate ideas.

References:

Some notable PRs and resources:

https://github.com/delta-io/delta/commit/01c0ef91565d9a3e06fb9f62295f7c80fba09351

https://github.com/delta-io/delta/commit/615c184a3677afb26b7ca82c630ebe595421555b

https://github.com/delta-io/delta/commit/dbdcd0b143d1b13b4f06f1928c0e19607145edf3

https://www.databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html


Location

Centrum Konferencyjne POLIN, Poland