viktor: Efficient Vectorized Computations in Kotlin

Introducing viktor

viktor is an open-source Kotlin library developed by JetBrains Research that aims to make array calculations more efficient. We achieve this by avoiding nested arrays, delegating expensive operations to JNI + SIMD, and providing built-in support for arithmetic on logarithmically stored numbers.

This post is in celebration of the 1.1.0 release. We will discuss what the library does (and what it doesn’t), how it came to be, and what lessons we’ve learned while developing it.

Please try the library out! It’s as simple as looking at some examples and adding a line to your Gradle script. If you’re still not convinced, take a look at our benchmark results.

What we do


viktor has been optimized to work with probability arrays, since it is primarily intended for model-based machine learning tasks. For example, we use it in our peak analyzer Span (a bioinformatics tool that detects enriched regions in genomic sequencing data) to fit the underlying hidden Markov model. In particular, we offer:

  • Idiomatic multidimensional array access (rows, columns, slices, views, etc.).
  • Extremely fast element-wise operations (arithmetic, exponent, logarithm, and the like)
    utilizing modern CPU cores to their full extent.
  • Really fast aggregate operations (sum, mean, standard deviation, etc.).
  • Built-in logarithmic storage support: you can convert your values to logarithms and
    work with them without having to convert them back.

What we don’t

  • viktor isn’t a linear algebra library – at least not yet. You can get it to multiply matrices, but it’s not optimized for that. If you need lightning-fast matrix multiplication, you’d be better off using Multik, nd4j, or another linear algebra library.

  • viktor doesn’t currently have out-of-the-box concurrency (as some other libraries do), though it can be parallelized on the client side.

  • viktor doesn’t do GPU computations. Many researchers use their laptops for work, while others have access to multi-core servers and even computational clusters. What unites all these setups is that they all have little to no GPU capabilities. viktor is intended for exactly those kinds of cases.



viktor sources are hosted on GitHub, together with a feature overview and some instructional examples.

Starting with version 1.1.0, viktor binaries (together with sources and Javadoc) are distributed via Maven Central, so it’s really easy to add viktor to any Maven-like JVM project:


Gradle / Gradle Kotlin:


The older versions were published on Bintray (currently in the sunset phase) and can be downloaded manually from GitHub Releases.


The main structure, F64Array, was inspired by NumPy’s ndarray.

Inside, it’s a flat DoubleArray endowed with an offset and two n-element integer arrays containing the n-dimensional array’s shape and strides. This structure allows us to easily express rows, columns, and other slices of an F64Array as other F64Arrays.

For instance, a 2×3 array will be stored as a 6-element DoubleArray with offset = 0, shape = {2, 3}, and strides = {3, 1}. If we want to view the second column (the one indexed 1), we just create an array with the same data, but with offset = 1, shape = {2}, and strides = {3}. It is simplicity itself!
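To make the offset/shape/strides arithmetic concrete, here is a minimal sketch in plain Kotlin. The class StridedArray and its view function are hypothetical names chosen for illustration; this is not viktor's actual implementation, only the same indexing idea under simplified assumptions (no bounds-friendly error types, no mutation API).

```kotlin
// Hypothetical, simplified stand-in for an n-dimensional array over flat storage.
// Names are illustrative only and do not mirror viktor's real classes.
class StridedArray(
    val data: DoubleArray,
    val offset: Int,
    val shape: IntArray,
    val strides: IntArray
) {
    init {
        require(shape.size == strides.size) { "shape and strides must have the same length" }
    }

    // Translates an n-dimensional index into a position in the flat storage.
    private fun flatIndex(indices: IntArray): Int {
        require(indices.size == shape.size) { "expected ${shape.size} indices" }
        var pos = offset
        for (d in indices.indices) {
            require(indices[d] in 0 until shape[d]) { "index out of bounds in dimension $d" }
            pos += indices[d] * strides[d]
        }
        return pos
    }

    operator fun get(vararg indices: Int): Double = data[flatIndex(indices)]

    // Fixes `index` along `axis` and returns a view that shares the same storage.
    fun view(index: Int, axis: Int): StridedArray {
        require(axis in shape.indices) { "no such axis: $axis" }
        require(index in 0 until shape[axis]) { "index out of bounds along axis $axis" }
        val keep = shape.indices.filter { it != axis }
        return StridedArray(
            data,
            offset + index * strides[axis],
            IntArray(keep.size) { shape[keep[it]] },
            IntArray(keep.size) { strides[keep[it]] }
        )
    }
}
```

With a 6-element storage array, offset 0, shape {2, 3}, and strides {3, 1}, calling view(1, axis = 1) produces exactly the column view described above: offset 1, shape {2}, strides {3}, with no data copied.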



viktor comes with a sizable set of arithmetic and mathematical operations. While the cheap arithmetic operations are performed in a loop, the more expensive mathematical ones are delegated to native code through the Java Native Interface. Moreover, we make sure to utilize the SIMD (single instruction, multiple data) extension sets available on most modern CPUs. The performance gains depend on the operation, the array size, the JVM version, and so on, but can reach 900% even in real-world cases (as it turns out, the JVM’s logarithm is pretty slow).

Another useful feature is the built-in support of log-stored values. When working with probabilities, floating-point underflows are a frequent occurrence: sometimes you have to multiply so many small numbers together that the result can no longer be expressed as a positive number and is instead rounded to zero, losing any utility. To overcome this, people frequently store not the probability itself, but rather its logarithm. Instead of multiplying the probabilities, they can then sum the logarithms. However, they may occasionally need to sum probabilities as well, and this operation is much less natural with logarithmic storage. So viktor provides a function named logAddExp, which does exactly that:

a logAddExp b = log(exp(a) + exp(b))

but in a way that prevents underflows. It is also possible to sum all the values in a log-stored array with logSumExp. These operations are also SIMDized whenever possible, achieving even better performance.
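The usual way to evaluate log(exp(a) + exp(b)) without underflow is to factor out the larger argument, so that the remaining exponent is never positive. The sketch below shows that trick in plain, scalar Kotlin. It is an illustration of the technique only, not viktor's SIMD implementation, and it assumes the inputs are finite or negative infinity (the logarithm of zero).

```kotlin
import kotlin.math.exp
import kotlin.math.ln
import kotlin.math.ln1p
import kotlin.math.max
import kotlin.math.min

// log(exp(a) + exp(b)), rewritten as hi + log(1 + exp(lo - hi)).
// Since lo - hi <= 0, exp() never overflows and the sum never collapses to zero.
fun logAddExp(a: Double, b: Double): Double {
    if (a == Double.NEGATIVE_INFINITY) return b // log(0 + exp(b)) = b
    if (b == Double.NEGATIVE_INFINITY) return a
    val hi = max(a, b)
    val lo = min(a, b)
    return hi + ln1p(exp(lo - hi))
}

// log of the sum of exp(x_i) over the whole array, using the same shift by the maximum.
fun logSumExp(logValues: DoubleArray): Double {
    var hi = Double.NEGATIVE_INFINITY
    for (v in logValues) if (v > hi) hi = v
    if (hi == Double.NEGATIVE_INFINITY) return hi // empty array or all zeros
    var sum = 0.0
    for (v in logValues) sum += exp(v - hi)
    return hi + ln(sum)
}
```

For two log-probabilities of -1000 each, the naive ln(exp(a) + exp(b)) evaluates to negative infinity because exp(-1000) underflows to zero, whereas the shifted form returns -1000 + ln 2.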


The viktor binary distribution currently supports one CPU architecture (amd64 / x86_64) with two extensions (SSE2, AVX) on Windows, Linux, and macOS. We use TeamCity for the multi-platform build.

Why we needed it

This whole project started because we just wanted to train some mathematical models for our research. However, the training was too slow for our tastes, even after adding concurrency. We used a profiler, and the results surprised us: most of the time was spent calculating logarithms. Just logarithms, over and over again. We replaced the built-in logarithm with Apache Commons Math‘s FastMath logarithm, but it didn’t help. We tried some other existing libraries, but none of them had the exact features that we needed. So, naturally, we had to write our own library.

With that decided, we ran into some design issues. We wanted our code to be idiomatic, like in Python’s NumPy, but we also wanted blazing-fast performance like what can be achieved with C++. Oh, and we also needed it in a JVM-compatible language, since our project was JVM-based. It took a little trial and error, but eventually viktor, a hybrid of C++, Kotlin, and Python, an idiomatic library with a native backend, was released in November 2019.

Incredible stories of (native) optimization

The path ad astra seems to always go per aspera. The following are a few trial-and-error examples, which could be educational, amusing, or both.

At first, we delegated every operation to JNI + SIMD. Then we did the benchmarks, rejoiced at the result, and released the library. However, in our pride, we didn’t think to benchmark the arithmetic operations; we only did mathematics (exponent etc.) and reduction (sum etc.). When we later added the arithmetic benchmarks, we were surprised to see that viktor performed poorly compared to a naïve loop. A couple of JITWatch sessions later, we learned that even the ancient JDK 1.8 is capable of SIMDizing the most primitive patterns, such as element-wise multiplication of two arrays, and that the JDK calls its own native code with much less overhead. Therefore, we dropped the SIMD arithmetic and replaced it with naïve loop arithmetic, and suddenly, performance increased.

At first, we tried to reduce the amount of code by only writing the in-place operations (such as +=), and then defining the copying operations as copy + in-place. Therefore,

a + b

was defined as

a.copy().apply { this += b }

However, it turned out that copying is stupidly expensive, to the point that copying takes the same amount of time as the actual calculation. In one of the benchmarks, the system spent 7 ms allocating the new array to hold the results, 10 ms copying the first argument over, and another 10 ms actually performing the addition. So we turned everything around and rewrote each method using a source-destination pattern (in-place methods write back to the source, while copying methods write to a separate destination array), and suddenly, performance increased. (This led to a non-negligible improvement even for computationally expensive operations.)
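A minimal sketch of the source-destination pattern on plain DoubleArrays may help here. The function names addInto, added, and addInPlace are hypothetical and chosen for this illustration; viktor's real methods operate on its own array type.

```kotlin
// Hypothetical kernel: adds `other` to `src` element-wise and writes the result to `dst`.
// Passing the same array as `src` and `dst` gives the in-place variant.
fun addInto(src: DoubleArray, other: DoubleArray, dst: DoubleArray) {
    require(src.size == other.size && src.size == dst.size) { "size mismatch" }
    for (i in src.indices) {
        dst[i] = src[i] + other[i]
    }
}

// Copying operation: one allocation and one pass, with no intermediate copy of `a`.
fun added(a: DoubleArray, b: DoubleArray): DoubleArray {
    val result = DoubleArray(a.size)
    addInto(a, b, result)
    return result
}

// In-place operation: the source doubles as the destination.
fun addInPlace(a: DoubleArray, b: DoubleArray) = addInto(a, b, a)
```

Compared with copy + in-place, the copying variant here still pays for the allocation but skips the separate pass that copied the first argument.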

At first, we wrote our code like this (simplified example):

val data: DoubleArray
// ...
operator fun plusAssign(other: F64Array) {
    for (i in 0 until size) {
        data[i] += other.data[i]
    }
}

However, it turned out that even the final references are not extracted, and the JVM performs a field load (getfield) two times per loop iteration. We wrote instead:

operator fun plusAssign(other: F64Array) {
    // read both fields once, outside the loop
    val dst = data
    val src = other.data
    for (i in 0 until size) {
        dst[i] += src[i]
    }
}

and suddenly, performance increased.


We have run a series of benchmarks on three JDKs: Oracle JDK 1.8, Oracle JDK 15, and GraalVM 20.3 (which implements OpenJDK 11). Each benchmark measured the performance of an operation using built-in JVM features (baseline) and using viktor methods. Each benchmark was run for different array sizes, from 1000 to 10 million elements.

The following plots show the performance gain (or loss) of viktor over the baseline for different operation groups.

Unary operations include exponent, logarithm, and their specialized versions:

expm1(x) = exp(x) - 1

log1p(x) = log(1 + x)

On the graph above, all the lines are well above 1, which means that the viktor implementations are much faster than their JVM counterparts. The effect is most pronounced on the older JDK 1.8, with a performance gain of up to 15x, but it is still significant on the more modern ones.

Binary operations include the usual arithmetic ones (addition and multiplication) and our specialized log-storage addition operation

log-add-exp = log(exp(a) + exp(b))

The performance of the basic arithmetic operations is very close to the baseline. This is owing to the fact that even the ancient JDK 1.8 is able to SIMDize them natively. The log-storage addition naturally benefits from the exponent and logarithm speedup seen on the previous plot.

Reduction operations include the sum of all elements, the specialized log-storage sum

log-sum-exp = log(∑_i exp(x_i))

and the scalar product of two vectors. After the blazing-fast unary operations, this might not seem like much, but speedups of 5x and 3x for two of these operations are not insignificant either.


Our library provides significant speedup for various array operations on multiple platforms. It also offers built-in log-storage support. We’d love to have new users and we welcome your feedback (just email Aleksei Dievskii directly, or use GitHub Issues).

Multik: Multidimensional Arrays in Kotlin

A lot of data-heavy tasks, as well as optimization problems, boil down to performing computations over multidimensional arrays. Today we’d like to share with you the first preview of a library that aims to serve as a foundation for such computations – Multik.

Multik offers both multidimensional array data structures and implementations of mathematical operations over them. The library has a simple and straightforward API and offers optimized performance.

Using Multik

Without further ado, here are some of the things you can do with Multik.

Create multidimensional arrays

Create a vector:

val a = mk.ndarray(mk[1, 2, 3])
/* [1, 2, 3] */

Create a vector from a collection:

val myList = listOf(1, 2, 3)
val a = mk.ndarray(myList)
/* [1, 2, 3] */

Create a matrix (two-dimensional array):

val m = mk.ndarray(mk[myList, myList])
/*
[[1, 2, 3],
[1, 2, 3]]
*/

Create a fixed-shape array of zeros:

mk.empty<Double, D2>(3, 4)
/*
[[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0]]
*/

Create an identity matrix (ones on the diagonal, the rest is set to 0)

val e = mk.identity<Double>(3) // create an identity array of shape (3, 3)
/*
[[1.0, 0.0, 0.0],
[0.0, 1.0, 0.0],
[0.0, 0.0, 1.0]]
*/

Create a 3-dimensional array (multik supports up to 4 dimensions):

mk.d3array(2, 2, 3) { it * it }
/*
[[[0, 1, 4],
[9, 16, 25]],

[[36, 49, 64],
[81, 100, 121]]]
*/

Perform mathematical operations over multidimensional arrays

val a = mk.ndarray(mk[mk[1.0, 2.0], mk[3.0, 4.0]])
val b = mk.identity<Double>(2)

a + b
/*
[[2.0, 2.0],
[3.0, 5.0]]
*/
a - b
/*
[[0.0, 2.0],
[3.0, 3.0]]
*/
b / a
/*
[[1.0, 0.0],
[0.0, 0.25]]
*/
b * a // element-wise product
/*
[[1.0, 0.0],
[0.0, 4.0]]
*/

Element-wise mathematical operations

mk.math.sin(a) // element-wise sin 
mk.math.cos(a) // element-wise cos
mk.math.log(b) // element-wise natural logarithm
mk.math.exp(b) // element-wise exp
mk.linalg.dot(a, b) // dot product

Aggregate functions

mk.math.sum(a) // array-wise sum
mk.math.min(b) // array-wise minimum elements
mk.math.cumSum(b, axis=1) // cumulative sum of the elements
mk.stat.mean(a) // mean
mk.stat.median(b) // median

Iterable operations

a.filter { it > 3 } // select all elements that are larger than 3
a.map { (it * it).toInt() } // return squares
a.groupNdarrayBy { it % 2 } // group elements by condition
a.sorted() // sort elements


a[2]  // select the element at the 2 index for a vector
b[1, 2] // select the element at row 1 column 2
b[1] // select row 1 
b[0.r..2, 1] // select elements at rows 0 and 1 in column 1
b[0..1..1] // select all elements at row 0
for (el in b) {
    print("$el, ") // 1.5, 2.1, 3.0, 4.0, 5.0, 6.0, 
}
// for n-dimensional
val q = b.asDNArray()
for (index in q.multiIndices) {
    print("${q[index]}, ") // 1.5, 2.1, 3.0, 4.0, 5.0, 6.0, 
}

Multik Architecture

Initially, we attempted to add Kotlin bindings to existing solutions, such as NumPy. However, this proved cumbersome and introduced unnecessary environmental complexity while providing little benefit to justify the overhead. As a result, we abandoned that approach and started Multik from scratch.

In Multik, the data structures are separate from the implementation of operations over them, and you need to add them as individual dependencies to your project. This approach gives you a consistent API no matter what implementation you decide to use in your project. So what are these different implementations?

Currently, there are three different ones:

  • multik-jvm: a Kotlin/JVM implementation of the math operations.

  • multik-native: a C++ implementation. OpenBLAS is used for linear algebra.

  • multik-default: the default implementation, which combines native and JVM implementations for optimal performance.

You can also write your own!
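The separation between a stable API and interchangeable implementations can be sketched in a few lines of plain Kotlin. Everything below (MathEngine, PlainKotlinEngine, DefaultEngine, mean) is a hypothetical illustration of the design idea, not Multik's actual interfaces or class names.

```kotlin
// Hypothetical API surface that client code depends on.
interface MathEngine {
    fun sum(values: DoubleArray): Double
    fun dot(a: DoubleArray, b: DoubleArray): Double
}

// A pure-Kotlin implementation, standing in for a JVM engine.
object PlainKotlinEngine : MathEngine {
    override fun sum(values: DoubleArray): Double {
        var acc = 0.0
        for (v in values) acc += v
        return acc
    }

    override fun dot(a: DoubleArray, b: DoubleArray): Double {
        require(a.size == b.size) { "size mismatch" }
        var acc = 0.0
        for (i in a.indices) acc += a[i] * b[i]
        return acc
    }
}

// A "default" engine that prefers an optional accelerated engine and falls back otherwise.
class DefaultEngine(
    private val accelerated: MathEngine? = null,
    private val fallback: MathEngine = PlainKotlinEngine
) : MathEngine {
    private val engine: MathEngine get() = accelerated ?: fallback

    override fun sum(values: DoubleArray): Double = engine.sum(values)
    override fun dot(a: DoubleArray, b: DoubleArray): Double = engine.dot(a, b)
}

// Client code is written against the interface only, so swapping engines needs no changes here.
fun mean(values: DoubleArray, engine: MathEngine): Double {
    require(values.isNotEmpty()) { "mean of an empty array is undefined" }
    return engine.sum(values) / values.size
}
```

Writing your own implementation then amounts to providing another MathEngine and passing it wherever an engine is expected.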

Multik is still in the early stages of development, and we are looking forward to your feedback, feature requests, and contributions! Check out the project’s GitHub repo, try Multik, and let us know what you’d like to see in future versions. Thanks!

Lets-Plot, in Kotlin

You can understand a lot about data from metrics, checks, and basic statistics. However, as humans, we grasp trends and patterns way quicker when we see them with our own eyes. If there was ever a moment you wished you could easily and quickly visualize your data, and you were not sure how to do it in Kotlin, this post is for you!

Today I’d like to talk to you about Lets-Plot for Kotlin, an open-source plotting library for statistical data written entirely in Kotlin. You’ll learn about its API, the kinds of plots you can build with it, and what makes this library unique. Let’s start with the API.

ggplot-like API

The Lets-Plot Kotlin API is built with the principles of layered graphics in mind. You may be familiar with this approach if you have ever used the ggplot2 package for R.

“This grammar […] is made up of a set of independent components that can be composed in many different ways. This makes [it] very powerful because you are not limited to a set of pre-specified graphics, but you can create new graphics that are precisely tailored for your problem.” Hadley Wickham, ggplot2: Elegant Graphics for Data Analysis

If you have worked with ggplot2 before, you may recognize the API’s style:

If not, let’s unpack what’s going on here. In Lets-Plot, a plot is represented by at least one layer. Layers are responsible for creating the objects painted on the ‘canvas’ and contain the following elements:

  • Data – the subset of data specified either once for all layers or on a per-layer basis. One plot can combine multiple different datasets (one per layer).
  • Aesthetic mapping – describes how variables in the dataset are mapped to the visual properties of the layer, such as color, shape, size, or position.
  • Geometric object – a geometric object that represents a particular type of chart.
  • Statistical transformation – computes some kind of statistical summary on the raw input data. For example, the bin statistic is used for histograms and smooth is used for regression lines.

  • Position adjustment – a method used to compute the final coordinates of geometry. Used to build variants of the same geometric object or to avoid overplotting.

To combine all these parts together, you need to use the following simple formula:

p = lets_plot(<dataframe>) 
p + geom_<chart_type>(stat=<stat>, position=<adjustment>) { <aesthetics mapping> }

You can learn more about the Lets-Plot basics and get a better understanding of what the individual building blocks do by checking out the Getting Started Guide.

Customizable plots

Out of the box, Lets-Plot supports numerous visualization types – histograms, box plots, scatter plots, line plots, contour plots, maps, and more!

All of the plots are flexible and highly customizable, yet the library manages to keep the balance between powerful customization capabilities and ease of use. You can start with simple but useful visualizations like data distribution:

[Figure: Histogram plot]

You have all the tools you need to create complex and nuanced visualizations, like this plot illustrating custom tooltips for the Iris dataset:

[Figure: Customizable tooltips]

Check out these tutorials to explore the available Lets-Plot visualizations and learn how to use them:

Integration with the Kotlin kernel for Jupyter Notebook

You may have noticed from the screenshots that these plots were created in Jupyter Notebook. Indeed, Lets-Plot integrates with the Kotlin kernel for Jupyter Notebook out of the box. If you have the Kotlin kernel installed (see the instructions on how to do so), all you need to do to start plotting is add the following line magic in your notebook:

%use lets-plot

That’s it! Plot away 🙂
Kotlin notebooks are also supported in JetBrains Datalore, an Online Data Science Notebook with smart coding assistance. Check out an example Datalore notebook that uses Lets-Plot.

Lets-Plot Internals

Finally, I wanted to share with you a little bit about the implementation of Lets-Plot, because it is a one-of-a-kind multiplatform library. Due to the unique multiplatform nature of Kotlin, the plotting functionality is written once in Kotlin and can then be packaged as a JavaScript library, JVM library, and a native Python extension.

[Figure: Lets-Plot internals]

Whichever environment you prefer, you can use the same functionality and API to visualize your data with Lets-Plot!

The Kotlin API is built on top of the JVM jar; however, you can also use the JVM jar independently. For instance, you can embed the plots into a JVM application using either JavaFX or the Apache Batik SVG Toolkit for graphics rendering.

Lets-Plot truly is an amazing example of Kotlin’s multiplatform potential and a great visualization tool for your data needs. I hope this post has sparked your interest and you’ll give it a go!

Deep Learning With Kotlin: Introducing KotlinDL-alpha

Hi folks!
Today we would like to share with you the first preview of KotlinDL (v.0.1.0), a high-level Deep Learning framework written in Kotlin and inspired by Keras. It offers simple APIs for building, training, and deploying deep learning models in a JVM environment. High-level APIs and sensible defaults for many parameters make it easy to get started with KotlinDL. You can create and train your first simple neural network with only a few lines of Kotlin code:

private val model = Sequential.of(
    Input(28, 28, 1),
    // ...
)

fun main() {
    val (train, test) = Dataset.createTrainAndTestDatasets(
        trainFeaturesPath = "datasets/mnist/train-images-idx3-ubyte.gz",
        trainLabelsPath = "datasets/mnist/train-labels-idx1-ubyte.gz",
        testFeaturesPath = "datasets/mnist/t10k-images-idx3-ubyte.gz",
        testLabelsPath = "datasets/mnist/t10k-labels-idx1-ubyte.gz",
        numClasses = 10,
        // ...
    )
    val (newTrain, validation) = train.split(splitRatio = 0.95)

    model.use {
        // configure the optimizer, loss, and metric before training
        it.compile(
            optimizer = Adam(),
            loss = Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS,
            metric = Metrics.ACCURACY
        )

        // train on the larger part of the training split
        it.fit(
            dataset = newTrain,
            epochs = 10,
            batchSize = 100,
            verbose = false
        )

        // evaluate on the held-out validation split
        val accuracy = it.evaluate(
            dataset = validation,
            batchSize = 100
        ).metrics[Metrics.ACCURACY]

        println("Accuracy: $accuracy")
        it.save(File("src/model/my_model"))
    }
}

GPU support

Training deep learning models can be resource-heavy, and you may wish to accelerate the process by running it on a GPU. This is easily achievable with KotlinDL!
With just one additional dependency, you can run the above code without any modifications on an NVIDIA GPU device.

Rich API

KotlinDL comes with all the necessary APIs for building and training feedforward neural networks, including Convolutional Neural Networks. It provides reasonable defaults for most hyperparameters and offers a wide range of optimizers, weight initializers, activation functions, and all the other necessary levers for you to tweak your model.
With KotlinDL, you can save the resulting model, and import it for inference in your JVM backend application.

Keras models import

Out of the box, KotlinDL offers APIs for building, training, saving deep learning models, and loading them to run inference. When importing a model for inference, you can use a model trained with KotlinDL, or you can import a model trained in Python with Keras (versions 2.*).

For models trained with KotlinDL or Keras, KotlinDL supports transfer learning methods that allow you to make use of an existing pre-trained model and fine-tune it to your task.

Temporary limitations

In this first alpha release, only a limited number of layers are available. This limitation means that not all Keras models are currently supported. You can import and fine-tune a pre-trained VGG-16 or VGG-19 model, but not, for example, a ResNet50 model. We are working hard on bringing more layers for you in the upcoming releases.

Another temporary limitation concerns deployment. You can deploy a model in a server-side JVM environment; inference on Android devices is not yet supported, but it is coming in later releases.

What’s under the hood?

KotlinDL is built on top of the TensorFlow Java API, which is being actively developed by the open-source community.

Give it a try!

We’ve prepared some tutorials to help you get started with KotlinDL:

Feel free to share your feedback through GitHub issues, create your own pull requests, and join the #deeplearning community on Kotlin Slack.

Introducing Kotlin for Apache Spark Preview

Apache Spark is an open-source unified analytics engine for large-scale distributed data processing. Over the last few years, it has become one of the most popular tools used for processing large amounts of data. It covers a wide range of tasks – from data batch processing and simple ETL (Extract/Transform/Load) to streaming and machine learning.

Due to Kotlin’s interoperability with Java, Kotlin developers can already work with Apache Spark via the Java API. This way, however, they cannot use Kotlin to its full potential, and the general experience is far from smooth.

Today, we are happy to share the first preview of the Kotlin API for Apache Spark. This project adds a missing layer of compatibility between Kotlin and Apache Spark. It allows you to write idiomatic Kotlin code using familiar language features such as data classes and lambda expressions.

Kotlin for Apache Spark also extends the existing APIs with a few nice features.

withSpark and withCached functions


withSpark is a simple and elegant way to work with SparkSession that will automatically take care of calling spark.stop() at the end of the block for you.

You can pass parameters to it that may be required to run Spark, such as master location, log level, or app name. It also comes with a convenient set of defaults for running Spark locally.

Here’s a classic example of counting occurrences of letters in lines:

val logFile = "a/path/to/logFile.txt"
withSpark(master = "yarn", logLevel = SparkLogLevel.DEBUG) {
    // read the file once and cache it for both counts
    spark.read().textFile(logFile).withCached {
        val numAs = filter { it.contains("a") }.count()
        val numBs = filter { it.contains("b") }.count()
        println("Lines with a: $numAs, lines with b: $numBs")
    }
}

Another useful function in the example above is withCached. In other APIs, if you want to fork computations into several paths, but compute things only once, you would call the ‘cache’ method. However, this quickly becomes difficult to track and you have to remember to unpersist the cached data. Otherwise, you risk taking up more memory than intended or even breaking things altogether. withCached takes care of tracking and unpersisting for you.
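The underlying idea is the familiar scoped-resource pattern: acquire, run the block, and release in a finally clause so that cleanup happens even if the block throws. The sketch below shows that pattern in plain Kotlin with a hypothetical CachedResource class and a hypothetical withCachedSketch function. It does not use Spark and is not the library's implementation.

```kotlin
// Hypothetical stand-in for a dataset that can be cached and must be unpersisted afterwards.
class CachedResource(val lines: List<String>) {
    var cached: Boolean = false
        private set

    fun cache(): CachedResource {
        cached = true
        return this
    }

    fun unpersist() {
        cached = false
    }
}

// Runs `block` with the resource cached, then always releases the cache,
// even when `block` throws.
fun <R> CachedResource.withCachedSketch(block: CachedResource.() -> R): R {
    cache()
    try {
        return block()
    } finally {
        unpersist()
    }
}
```

Because the block is an extension lambda, the resource's members are available inside it without a qualifier, which is what makes calls such as filter { ... } in the example above read so compactly.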

Null safety

Kotlin for Spark adds leftJoin, rightJoin, and other aliases to the existing methods; however, these are null-safe by design.

fun main() {

    data class Coordinate(val lon: Double, val lat: Double)
    data class City(val name: String, val coordinate: Coordinate)
    data class CityPopulation(val city: String, val population: Long)

    withSpark(appName = "Find biggest cities to visit") {
        val citiesWithCoordinates = dsOf(
                City("Moscow", Coordinate(37.6155600, 55.7522200)),
                // ...
        )

        val populations = dsOf(
                CityPopulation("Moscow", 11_503_501L),
                // ...
        )

        citiesWithCoordinates.rightJoin(populations, citiesWithCoordinates.col("name") `==` populations.col("city"))
                .filter { (_, citiesPopulation) ->
                    citiesPopulation.population > 15_000_000L
                }
                .map { (city, _) ->
                    // A city may potentially be null in this right join!!!
                    city?.coordinate
                }
                .filterNotNull() // drop rows that had no matching city
                .show()
    }
}

Note the city?.coordinate line in the example above. A city may potentially be null in this right join. This would’ve caused a NullPointerException in other JVM Spark APIs, and it would’ve been rather difficult to debug the source of the problem.

Kotlin for Apache Spark takes care of null safety for you and you can conveniently filter out null results.
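Why the compiler can catch this is easier to see on an in-memory analogue. In the sketch below, a right join over plain lists returns pairs whose left component is nullable, so any access to it has to go through a safe call. The function rightJoinSketch and the data classes are hypothetical and use made-up values; this is not the Spark API.

```kotlin
// Hypothetical in-memory right join: every right-hand row is kept,
// and the left side is null when nothing matches.
fun <L, R> rightJoinSketch(
    left: List<L>,
    right: List<R>,
    matches: (L, R) -> Boolean
): List<Pair<L?, R>> =
    right.flatMap<R, Pair<L?, R>> { r ->
        val hits = left.filter { matches(it, r) }
        if (hits.isEmpty()) listOf<Pair<L?, R>>(null to r) else hits.map { it to r }
    }

data class Place(val name: String, val latitude: Double)
data class Headcount(val place: String, val count: Long)

// Returns latitudes of places whose headcount exceeds `threshold`.
// `place` has type Place?, so `place.latitude` without `?.` would not compile.
fun latitudesAbove(places: List<Place>, counts: List<Headcount>, threshold: Long): List<Double> =
    rightJoinSketch(places, counts) { p, c -> p.name == c.place }
        .filter { (_, c) -> c.count > threshold }
        .map { (place, _) -> place?.latitude }
        .filterNotNull()
```

The nullable left component in the return type is the whole point: the possibility of a missing match is visible in the signature rather than discovered at run time.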

What’s supported

This initial version of Kotlin for Apache Spark supports Apache Spark 3.0 with the core compiled against Scala 2.12.

The API covers all the methods needed for creating self-contained Spark applications best suited for batch ETL.

Getting started with Kotlin for Apache Spark

To help you quickly get started with Kotlin for Apache Spark, we have prepared a Quick Start Guide that will help you set up the environment, correctly define dependencies for your project, and run your first self-contained Spark application written in Kotlin.

What’s next

We understand that it takes a while to upgrade any existing framework to a newer version, and Spark is no exception. That is why in the next update we are going to add support for the earlier Spark versions: 2.4.2 – 2.4.6.

We are also working on the Kotlin Spark shell so that you can enjoy working with your data in an interactive manner, and perform exploratory data analysis with it.

Currently, Spark Streaming and Spark MLlib are not covered by this API, but we will be closely listening to your feedback and will address it in our roadmap accordingly.

In the future, we hope to see Kotlin join the official Apache Spark project as a first-class citizen. We believe that it can add value both for Kotlin, and for the Spark community. That is why we have opened a Spark Project Improvement Proposal: Kotlin support for Apache Spark. We encourage you to voice your opinions and join the discussion.

Go ahead and try Kotlin for Apache Spark and let us know what you think!
