… or about the benefits of reinventing the wheel

Intro

It all began with the fact that I’ve decided to study k-means clustering algorithm. The more resources I read, the more confused . The definition of cluster centers has been stumbling-block. The most accessible explanation was provided by Wikipedia:

Assign each observation to the cluster whose mean has the least squared Euclidean distance.

But how does this happen? So I’ve decided to reinvent a wheel and reproduce step by step all stages of the algorithm. As a basis I’ve decided to take an example from the book by Roger D. Peng Exploratory Data Analysis with R.

Because I’m a passionate fan of R I’ve decided to use fantastic package tydiverse to reproduce algorithm and gganimate to visualize result.
So, let’s begin!

1. Dive into k-means

Data preparation

I’ve decided to reproduce dataset that is provided by Roger D.Peng (Figure 12.1: Simulated dataset)

set.seed(1234)
x <- c(0.75, 1.1, 1.25, 0.55, 2.15, 2.17, 1.85, 1.86, 2.87, 2.75, 2.91, 2.71)
y <- c(0.6, 1.0, 1.25, 1.0, 1.9, 1.8, 1.85, 2.5, 1.0, 0.85, 0.86, 1.18)
df_xy <- tibble(x, y) %>% rowid_to_column()
# Length of dataset
n_xy <- dim(df_xy)[1]

Let’s take three clusters for example

Cx = c(1, 1.7, 2.5)
Cy = c(2, 1.0, 1.5)
# Number of clusters define as length of Cx or Cy vectors
n_clust <- length(Cx)
df_cl <- tibble(Cluster = factor(seq(n_clust), ordered = T), Cx, Cy)

Next create a dataset with coordinates of observations and coordinates of cluster centers

df_kmeans0 <- bind_cols(df_xy, map_dfr(df_cl, rep, length.out = length(x)))

Here’s how all this looks like at the initial stage

ggplot(df_kmeans0, aes(x, y)) +
  geom_point(shape = 1, size = 5, stroke = 1) +
  geom_text(aes(x, y, label = rownames(df_xy)), vjust = 2.0, size = 4, color = "black") + 
  geom_point(aes(Cx, Cy, color = Cluster), shape = 3, size = 5, stroke = 1) +
  coord_equal() +
  labs(title = "The initial location of the cluster centers") +
  theme(plot.title = element_text(hjust = 0.5))

Now let’s look what’s happen under the hood of k-means

Step 1. Calculate squared Euclidean distance between each observations and each clusters

Firs of all we need to choose a metric for measuring distance between observations and cluster’s centers. Let it be Euclidean distance or rather squared Euclidean distance.

Eu_dist2 = (xi - xc)2 + (yi - yc)2

Next, it needs to determine the distance from each point to each cluster. Since we have 12 points and 3 clusters, we get 36 distances. It was the first stumbling-block.

df_dist <- bind_cols(map_dfr(df_xy, rep, n_clust),
                     map_dfr(df_cl, rep, each = n_xy)) %>% 
  mutate(Eu_dist = sqrt((.$x - .$Cx)^2 + (.$y - .$Cy)^2))
df_dist

Step.2 Assign each observation to the cluster whose mean has the least squared Euclidean distance

step2 <- df_dist %>% 
  select(-Cx, -Cy) %>% 
  spread(Cluster, Eu_dist) %>% 
  nest(-rowid, -x, -y) %>%
  mutate(Eu_dist = map_dbl(.$data, min),
         Cluster = map_dbl(.$data, which.min) %>% factor(., ordered = T)) %>%
  unnest(data)
step2

Columns 1-3 represent all (3 in our case) distances of observation to each cluster. We need to choose the minimum value for each observation. They are gathered in Eu_dist column. So, the columns Eu_dist and Cluster demonstrate which observations to which clusters belong. It was the second and the last stumbling-block.

Step 3. Join cluster’s coordinates to previous dataset

Now it remains to add the coordinates of cluster centers to our previous data set

step2 %>%
  left_join(df_cl, by = 'Cluster') -> step3
step3

Let’s demonstrate result

ggplot(step3, aes(x, y, color = Cluster)) +
  geom_point(size = 5) +
  geom_point(aes(x = step3$Cx, y = step3$Cy), shape = 3, size = 7) +
  geom_text(label = rownames(step3), vjust = 1.8, size = 4, color = 'black') +
  coord_fixed()

Step 4. Recalculation center’s coordinates of new clusters

step3 %>% 
  select(-num_range("", 1:n_clust)) %>% 
  group_by(Cluster) %>% 
  mutate(Cx2 = mean(x),
         Cy2 = mean(y)) -> step4
step4

Let’s look at new clusters (smaller size - are initial values)

step4 %>% 
  ggplot(aes(x, y, color = Cluster)) +
    geom_point(size = 5) +
    geom_point(aes(x = step4$Cx, y = step4$Cy), shape = 3, size = 3) +
    geom_point(aes(x = step4$Cx2, y = step4$Cy2), shape = 3, size = 6) +
    geom_text(aes(label = rownames(step4)), vjust = 1.8, size = 4, color = 'black') 

Put our dataset with new clusters into “canonical” form

step4 %>% 
  select(Cluster, Cx2, Cy2) %>% 
  group_by(Cluster) %>% 
  unique(.) %>% 
  arrange(Cluster) -> new_iteration
new_iteration

Next - repeat iteration from step 1 and so on, but for this purpose it needs to have something like recursive function or even two!. I’ve created k_means() function that execute k-means algorithm and wrapper_kmeans() function for calling k_means() in the best way. These function and couple others are contained in helper_functions.R file.
It must be noted, at this stage I’ve became aware of the principle of k-means, but I wanted to see the result for several iterations and compare one with a regular algorithm - the kmeans() function.
So, let’s keep on!

2. k-means from a bird’s eye view

Set number of iteration and pass it with observation (df_xy) and cluster (df_cl) tibbles to wraper_kmeans(). It return list of tibbles with observations and clusters from k_means() function from all iterations.

n_iter <- 5 # nubmer of k-means iterations
df_kmeans <- wraper_kmeans(df_xy, df_cl, n_iter)
df_kmeans[[n_iter]]

Also we can view main results of k_means algorithm for chosen iteration

my_kmeans <- get_kmeans(n_iter, df_kmeans)
my_kmeans

… and compare it this result of build-in k-means() function

And how about visualization? And it is desirable with animation!
Yes of course! But first make “long” dataframe from the df_kmeans list:

df_long <- map_dfr(df_kmeans, ~ .)

Ta da!

theme_ani <- theme(
          plot.title = element_text(size = 14, face = "bold", hjust = 0.5),
          legend.title = element_text(face = "bold", colour = "black", size = 12),
          legend.text = element_text(size = 11, color = "black"),
          axis.text = element_text(size = 11, colour = "black"),
          axis.title = element_text(size = 12, face = "bold"))
ggplot(df_long, aes(x, y, color = Cluster, frame = Iteration)) +
  geom_point(aes(group = Cluster), size = 5) +
  geom_point(aes(x = Cx, y = Cy), shape = 3, size = 5, color = "black", stroke = 1) +
  geom_path(aes(x = Cx, y = Cy, cumulative = TRUE, group = Cluster),
            color = "black") +
  geom_text(aes(x, y, label = rowid), vjust = 1.8, size = 4, color = 'black') +
  geom_polygon(data = df_polygon(df_kmeans),
               aes(x, y, group = Cluster, fill = Cluster), alpha = .5) +
  coord_equal() +
  labs(title = "How k-means work:") +
  theme_ani -> gg
gganimate(gg)
ggplot(df_long, aes(x = rowid, y = Eu_dist, frame = Iteration, cumulative = TRUE)) +
  geom_point(aes(color = Iteration), size = 5) +
  geom_line(aes(color = Iteration), size = 1) +
  scale_x_continuous(breaks = 1:dim(df_xy)[1]) +
  labs(x = "Observations", y = "Euclidean distance squared",
       title = "Squared Euclidean distance for each observation on:") +
  theme_ani -> gg_dist
gganimate(gg_dist)
df_long %>% 
  group_by(Iteration) %>% 
  summarise(tot_sum = sum(Eu_dist)) %>% 
  ungroup() %>% 
  ggplot(aes(x = Iteration, y = tot_sum, frame = Iteration, cumulative = TRUE)) +
    geom_bar(stat = 'identity', aes(fill = Iteration)) +
    labs(x = NULL, title = "Total within-cluster sum of square on:") +
    theme_ani -> gg_tot_sum
gganimate(gg_tot_sum)

Comment#1

By the end I want to mention some things.
This is a demonstration example, so the k-means algorithm implemented here may be far from effective. Because of this, I do not recommend using it with large datasets and a large number of iterations (> 10). Of course, you can use your own dataset if you are interested.

Comment#2

So, what have I achieved by reinventing the wheel:
1. Understood the k-means algorithm, of course)
2. Improved my skills in tydiverse
3. Learned to write recursive functions that return two values, one of which is necessary for recursion work, and the other - for external needs (this is what wraper_kmeans() return). This was achieved by the additional wrapper function and the concept of environments.
4. Learned to make animated charts with gganimate.
5. Examined chull function from base R
6. Learned how to hide comments in Rmarkdown

I’d be appriciate for your critical remarks and comments. Well, of course I will be very-very-very happy if this material helps you understand k-means as it helped me! You can find code here

LS0tCnRpdGxlOiAiay1tZWFucyBleHBsYW5hdGlvbiIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQojIyMgLi4uIG9yIGFib3V0IHRoZSBiZW5lZml0cyBvZiByZWludmVudGluZyB0aGUgd2hlZWwgICAKCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQprbml0cjo6b3B0c19jaHVuayRzZXQoZWNobyA9IFRSVUUsCiAgICAgICAgICAgICAgICAgICAgICBmaWcud2lkdGggPSA4KQoKbGlicmFyeSh0aWR5dmVyc2UpCmxpYnJhcnkoZ2dhbmltYXRlKQpsaWJyYXJ5KGFuaW1hdGlvbikKCnNvdXJjZSgiaGVscGVyX2Z1bmN0aW9ucy5SIikKCmFuaS5vcHRpb25zKGFuaS53aWR0aCA9IDc4MCkKYGBgCgojIyMgKipJbnRybyoqCgpJdCBhbGwgYmVnYW4gd2l0aCB0aGUgZmFjdCB0aGF0IEkndmUgZGVjaWRlZCB0byBzdHVkeSBrLW1lYW5zIGNsdXN0ZXJpbmcgYWxnb3JpdGhtLiBUaGUgbW9yZSByZXNvdXJjZXMgSSByZWFkLCB0aGUgbW9yZSBjb25mdXNlZCAuIFRoZSAgZGVmaW5pdGlvbiBvZiBjbHVzdGVyIGNlbnRlcnMgaGFzIGJlZW4gc3R1bWJsaW5nLWJsb2NrLiBUaGUgbW9zdCBhY2Nlc3NpYmxlIGV4cGxhbmF0aW9uIHdhcyBwcm92aWRlZCBieSBbV2lraXBlZGlhXShodHRwczovL2VuLndpa2lwZWRpYS5vcmcvd2lraS9LLW1lYW5zX2NsdXN0ZXJpbmcpOgoKPiBBc3NpZ24gZWFjaCBvYnNlcnZhdGlvbiB0byB0aGUgY2x1c3RlciB3aG9zZSBtZWFuIGhhcyB0aGUgbGVhc3Qgc3F1YXJlZCBFdWNsaWRlYW4gZGlzdGFuY2UuCgpCdXQgaG93IGRvZXMgdGhpcyBoYXBwZW4/IFNvIEkndmUgZGVjaWRlZCB0byByZWludmVudCBhIHdoZWVsIGFuZCByZXByb2R1Y2Ugc3RlcCBieSBzdGVwIGFsbCBzdGFnZXMgb2YgdGhlIGFsZ29yaXRobS4KQXMgYSBiYXNpcyBJJ3ZlIGRlY2lkZWQgdG8gdGFrZSBhbiBleGFtcGxlIGZyb20gdGhlIGJvb2sgYnkgUm9nZXIgRC4gUGVuZyBbRXhwbG9yYXRvcnkgRGF0YSBBbmFseXNpcyB3aXRoIFJdKGh0dHBzOi8vYm9va2Rvd24ub3JnL3JkcGVuZy9leGRhdGEvay1tZWFucy1jbHVzdGVyaW5nLmh0bWwpLgoKQmVjYXVzZSBJJ20gYSBwYXNzaW9uYXRlIGZhbiBvZiBSIEkndmUgZGVjaWRlZCB0byB1c2UgZmFudGFzdGljIHBhY2thZ2UgW3R5ZGl2ZXJzZV0oaHR0cHM6Ly93d3cudGlkeXZlcnNlLm9yZy8pIHRvIHJlcHJvZHVjZSBhbGdvcml0aG0gYW5kIFtnZ2FuaW1hdGVdKGh0dHBzOi8vZ2l0aHViLmNvbS90aG9tYXNwODUvZ2dhbmltYXRlKSB0byB2aXN1YWxpemUgcmVzdWx0LiAgClNvLCBsZXQncyBiZWdpbiEKCiMjIyAqKjEuIERpdmUgaW50byBrLW1lYW5zKioKCiMjIyMgKipEYXRhIHByZXBhcmF0aW9uKioKSSd2ZSBkZWNpZGVkIHRvIHJlcHJvZHVjZSBkYXRhc2V0IHRoYXQgaXMgcHJvdmlkZWQgYnkgUm9nZXIgRC5QZW5nIChGaWd1cmUgMTIuMTogU2ltdWxhdGVkIGRhdGFzZXQpIVtdKHNpbXVsYXRlZF9kYXRhc2V0LnBuZykKYGBge3J9CnNldC5zZWVkKDEyMzQpCnggPC0gYygwLjc1LCAxLjEsIDEuMjUsIDAuNTUsIDIuMTUsIDIuMTcsIDEuODUsIDEuODYsIDIuODcsIDIuNzUsIDIuOTEsIDIuNzEpCnkgPC0gYygwLjYsIDEuMCwgMS4yNSwgMS4wLCAxLjksIDEuOCwgMS44NSwgMi41LCAxLjAsIDAuODUsIDAuODYsIDEuMTgpCmRmX3h5IDwtIHRpYmJsZSh4LCB5KSAlPiUgcm93aWRfdG9fY29sdW1uKCkKIyBMZW5ndGggb2YgZGF0YXNldApuX3h5IDwtIGRpbShkZl94eSlbMV0KYGBgCkxldCdzIHRha2UgdGhyZWUgY2x1c3RlcnMgZm9yIGV4YW1wbGUKYGBge3J9CkN4ID0gYygxLCAxLjcsIDIuNSkKQ3kgPSBjKDIsIDEuMCwgMS41KQojIE51bWJlciBvZiBjbHVzdGVycyBkZWZpbmUgYXMgbGVuZ3RoIG9mIEN4IG9yIEN5IHZlY3RvcnMKbl9jbHVzdCA8LSBsZW5ndGgoQ3gpCmRmX2NsIDwtIHRpYmJsZShDbHVzdGVyID0gZmFjdG9yKHNlcShuX2NsdXN0KSwgb3JkZXJlZCA9IFQpLCBDeCwgQ3kpCmBgYApOZXh0IGNyZWF0ZSBhIGRhdGFzZXQgd2l0aCBjb29yZGluYXRlcyBvZiBvYnNlcnZhdGlvbnMgYW5kIGNvb3JkaW5hdGVzIG9mIGNsdXN0ZXIgY2VudGVycwpgYGB7cn0KZGZfa21lYW5zMCA8LSBiaW5kX2NvbHMoZGZfeHksIG1hcF9kZnIoZGZfY2wsIHJlcCwgbGVuZ3RoLm91dCA9IGxlbmd0aCh4KSkpCmBgYApIZXJlJ3MgaG93IGFsbCB0aGlzIGxvb2tzIGxpa2UgYXQgdGhlIGluaXRpYWwgc3RhZ2UgIApgYGB7cn0KZ2dwbG90KGRmX2ttZWFuczAsIGFlcyh4LCB5KSkgKwogIGdlb21fcG9pbnQoc2hhcGUgPSAxLCBzaXplID0gNSwgc3Ryb2tlID0gMSkgKwogIGdlb21fdGV4dChhZXMoeCwgeSwgbGFiZWwgPSByb3duYW1lcyhkZl94eSkpLCB2anVzdCA9IDIuMCwgc2l6ZSA9IDQsIGNvbG9yID0gImJsYWNrIikgKyAKICBnZW9tX3BvaW50KGFlcyhDeCwgQ3ksIGNvbG9yID0gQ2x1c3RlciksIHNoYXBlID0gMywgc2l6ZSA9IDUsIHN0cm9rZSA9IDEpICsKICBjb29yZF9lcXVhbCgpICsKICBsYWJzKHRpdGxlID0gIlRoZSBpbml0aWFsIGxvY2F0aW9uIG9mIHRoZSBjbHVzdGVyIGNlbnRlcnMiKSArCiAgdGhlbWUocGxvdC50aXRsZSA9IGVsZW1lbnRfdGV4dChoanVzdCA9IDAuNSkpCmBgYAoKTm93IGxldCdzIGxvb2sgd2hhdCdzIGhhcHBlbiB1bmRlciB0aGUgaG9vZCBvZiBrLW1lYW5zCgojIyMjICoqU3RlcCAxLiBDYWxjdWxhdGUgc3F1YXJlZCBFdWNsaWRlYW4gZGlzdGFuY2UgYmV0d2VlbiBlYWNoIG9ic2VydmF0aW9ucyBhbmQgZWFjaCBjbHVzdGVycyoqCkZpcnMgb2YgYWxsIHdlIG5lZWQgdG8gY2hvb3NlIGEgbWV0cmljIGZvciBtZWFzdXJpbmcgZGlzdGFuY2UgYmV0d2VlbiBvYnNlcnZhdGlvbnMgYW5kIGNsdXN0ZXIncyBjZW50ZXJzLiBMZXQgaXQgYmUgRXVjbGlkZWFuIGRpc3RhbmNlCm9yIHJhdGhlciBzcXVhcmVkIEV1Y2xpZGVhbiBkaXN0YW5jZS4KCj4gRXVfZGlzdF4yXiA9ICh4fml+IC0geH5jfileMl4gKyAoeX5pfiAtIHl+Y34pXjJeCgpOZXh0LCBpdCBuZWVkcyB0byBkZXRlcm1pbmUgdGhlIGRpc3RhbmNlIGZyb20gZWFjaCBwb2ludCB0byBlYWNoIGNsdXN0ZXIuIFNpbmNlIHdlIGhhdmUgMTIgcG9pbnRzIGFuZCAzIGNsdXN0ZXJzLCB3ZSBnZXQgMzYgZGlzdGFuY2VzLiBJdCB3YXMgdGhlIGZpcnN0IHN0dW1ibGluZy1ibG9jay4KYGBge3J9CmRmX2Rpc3QgPC0gYmluZF9jb2xzKG1hcF9kZnIoZGZfeHksIHJlcCwgbl9jbHVzdCksCiAgICAgICAgICAgICAgICAgICAgIG1hcF9kZnIoZGZfY2wsIHJlcCwgZWFjaCA9IG5feHkpKSAlPiUgCiAgbXV0YXRlKEV1X2Rpc3QgPSBzcXJ0KCguJHggLSAuJEN4KV4yICsgKC4keSAtIC4kQ3kpXjIpKQpkZl9kaXN0CmBgYAojIyMjICoqU3RlcC4yIEFzc2lnbiBlYWNoIG9ic2VydmF0aW9uIHRvIHRoZSBjbHVzdGVyIHdob3NlIG1lYW4gaGFzIHRoZSBsZWFzdCBzcXVhcmVkIEV1Y2xpZGVhbiBkaXN0YW5jZSoqCgpgYGB7cn0Kc3RlcDIgPC0gZGZfZGlzdCAlPiUgCiAgc2VsZWN0KC1DeCwgLUN5KSAlPiUgCiAgc3ByZWFkKENsdXN0ZXIsIEV1X2Rpc3QpICU+JSAKICBuZXN0KC1yb3dpZCwgLXgsIC15KSAlPiUKICBtdXRhdGUoRXVfZGlzdCA9IG1hcF9kYmwoLiRkYXRhLCBtaW4pLAogICAgICAgICBDbHVzdGVyID0gbWFwX2RibCguJGRhdGEsIHdoaWNoLm1pbikgJT4lIGZhY3RvciguLCBvcmRlcmVkID0gVCkpICU+JQogIHVubmVzdChkYXRhKQpzdGVwMgpgYGAKQ29sdW1ucyAxLTMgcmVwcmVzZW50IGFsbCAoMyBpbiBvdXIgY2FzZSkgIGRpc3RhbmNlcyBvZiBvYnNlcnZhdGlvbiB0byBlYWNoIGNsdXN0ZXIuIFdlIG5lZWQgdG8gY2hvb3NlIHRoZSBtaW5pbXVtIHZhbHVlIGZvciBlYWNoIG9ic2VydmF0aW9uLiBUaGV5IGFyZSBnYXRoZXJlZCBpbiBFdV9kaXN0IGNvbHVtbi4gU28sIHRoZSBjb2x1bW5zIEV1X2Rpc3QgYW5kIENsdXN0ZXIgZGVtb25zdHJhdGUgd2hpY2ggb2JzZXJ2YXRpb25zIHRvIHdoaWNoIGNsdXN0ZXJzIGJlbG9uZy4gSXQgd2FzIHRoZSBzZWNvbmQgYW5kIHRoZSBsYXN0IHN0dW1ibGluZy1ibG9jay4gCgojIyMjICoqU3RlcCAzLiBKb2luIGNsdXN0ZXIncyBjb29yZGluYXRlcyB0byBwcmV2aW91cyBkYXRhc2V0KioKTm93IGl0IHJlbWFpbnMgdG8gYWRkIHRoZSBjb29yZGluYXRlcyBvZiBjbHVzdGVyIGNlbnRlcnMgdG8gb3VyIHByZXZpb3VzIGRhdGEgc2V0ICAKYGBge3J9CnN0ZXAyICU+JQogIGxlZnRfam9pbihkZl9jbCwgYnkgPSAnQ2x1c3RlcicpIC0+IHN0ZXAzCnN0ZXAzCmBgYApMZXQncyBkZW1vbnN0cmF0ZSByZXN1bHQgIApgYGB7cn0KZ2dwbG90KHN0ZXAzLCBhZXMoeCwgeSwgY29sb3IgPSBDbHVzdGVyKSkgKwogIGdlb21fcG9pbnQoc2l6ZSA9IDUpICsKICBnZW9tX3BvaW50KGFlcyh4ID0gc3RlcDMkQ3gsIHkgPSBzdGVwMyRDeSksIHNoYXBlID0gMywgc2l6ZSA9IDcpICsKICBnZW9tX3RleHQobGFiZWwgPSByb3duYW1lcyhzdGVwMyksIHZqdXN0ID0gMS44LCBzaXplID0gNCwgY29sb3IgPSAnYmxhY2snKSArCiAgY29vcmRfZml4ZWQoKQpgYGAKCiMjIyMgKipTdGVwIDQuIFJlY2FsY3VsYXRpb24gY2VudGVyJ3MgY29vcmRpbmF0ZXMgb2YgbmV3IGNsdXN0ZXJzKioKYGBge3J9CnN0ZXAzICU+JSAKICBzZWxlY3QoLW51bV9yYW5nZSgiIiwgMTpuX2NsdXN0KSkgJT4lIAogIGdyb3VwX2J5KENsdXN0ZXIpICU+JSAKICBtdXRhdGUoQ3gyID0gbWVhbih4KSwKICAgICAgICAgQ3kyID0gbWVhbih5KSkgLT4gc3RlcDQKc3RlcDQKYGBgCkxldCdzIGxvb2sgYXQgbmV3IGNsdXN0ZXJzIChzbWFsbGVyIHNpemUgLSBhcmUgaW5pdGlhbCB2YWx1ZXMpICAKYGBge3J9CnN0ZXA0ICU+JSAKICBnZ3Bsb3QoYWVzKHgsIHksIGNvbG9yID0gQ2x1c3RlcikpICsKICAgIGdlb21fcG9pbnQoc2l6ZSA9IDUpICsKICAgIGdlb21fcG9pbnQoYWVzKHggPSBzdGVwNCRDeCwgeSA9IHN0ZXA0JEN5KSwgc2hhcGUgPSAzLCBzaXplID0gMykgKwogICAgZ2VvbV9wb2ludChhZXMoeCA9IHN0ZXA0JEN4MiwgeSA9IHN0ZXA0JEN5MiksIHNoYXBlID0gMywgc2l6ZSA9IDYpICsKICAgIGdlb21fdGV4dChhZXMobGFiZWwgPSByb3duYW1lcyhzdGVwNCkpLCB2anVzdCA9IDEuOCwgc2l6ZSA9IDQsIGNvbG9yID0gJ2JsYWNrJykgCmBgYAoKUHV0IG91ciBkYXRhc2V0IHdpdGggbmV3IGNsdXN0ZXJzIGludG8gImNhbm9uaWNhbCIgZm9ybSAgCmBgYHtyIHdhcm5pbmc9RkFMU0V9CnN0ZXA0ICU+JSAKICBzZWxlY3QoQ2x1c3RlciwgQ3gyLCBDeTIpICU+JSAKICBncm91cF9ieShDbHVzdGVyKSAlPiUgCiAgdW5pcXVlKC4pICU+JSAKICBhcnJhbmdlKENsdXN0ZXIpIC0+IG5ld19pdGVyYXRpb24KbmV3X2l0ZXJhdGlvbgpgYGAKTmV4dCAtIHJlcGVhdCBpdGVyYXRpb24gZnJvbSBzdGVwIDEgYW5kIHNvIG9uLCBidXQgZm9yIHRoaXMgcHVycG9zZSBpdCBuZWVkcyB0byBoYXZlIHNvbWV0aGluZyBsaWtlIHJlY3Vyc2l2ZSBmdW5jdGlvbiBvciBldmVuIHR3byEuIEkndmUgY3JlYXRlZCBrX21lYW5zKCkgZnVuY3Rpb24gdGhhdCBleGVjdXRlIGstbWVhbnMgYWxnb3JpdGhtIGFuZCB3cmFwcGVyX2ttZWFucygpIGZ1bmN0aW9uIGZvciBjYWxsaW5nIGtfbWVhbnMoKSBpbiB0aGUgYmVzdCB3YXkuIFRoZXNlIGZ1bmN0aW9uIGFuZCBjb3VwbGUgb3RoZXJzIGFyZSBjb250YWluZWQgaW4gaGVscGVyX2Z1bmN0aW9ucy5SIGZpbGUuICAgCkl0IG11c3QgYmUgbm90ZWQsIGF0IHRoaXMgc3RhZ2UgSSd2ZSBiZWNhbWUgYXdhcmUgb2YgdGhlIHByaW5jaXBsZSBvZiBrLW1lYW5zLCBidXQgSSB3YW50ZWQgdG8gc2VlIHRoZSByZXN1bHQgZm9yIHNldmVyYWwgaXRlcmF0aW9ucyBhbmQgY29tcGFyZSBvbmUgd2l0aCBhIHJlZ3VsYXIgYWxnb3JpdGhtIC0gdGhlIGttZWFucygpIGZ1bmN0aW9uLiAgClNvLCBsZXQncyBrZWVwIG9uIQoKIyMjICoqMi4gay1tZWFucyBmcm9tIGEgYmlyZCdzIGV5ZSB2aWV3KioKClNldCBudW1iZXIgb2YgaXRlcmF0aW9uIGFuZCBwYXNzIGl0IHdpdGggb2JzZXJ2YXRpb24gKGRmX3h5KSBhbmQgY2x1c3RlciAoZGZfY2wpIHRpYmJsZXMgdG8gd3JhcGVyX2ttZWFucygpLiBJdCByZXR1cm4gbGlzdCBvZiB0aWJibGVzIHdpdGggb2JzZXJ2YXRpb25zIGFuZCBjbHVzdGVycyBmcm9tIGtfbWVhbnMoKSBmdW5jdGlvbiBmcm9tIGFsbCBpdGVyYXRpb25zLgpgYGB7ciB3YXJuaW5nPUZBTFNFfQpuX2l0ZXIgPC0gNSAjIG51Ym1lciBvZiBrLW1lYW5zIGl0ZXJhdGlvbnMKZGZfa21lYW5zIDwtIHdyYXBlcl9rbWVhbnMoZGZfeHksIGRmX2NsLCBuX2l0ZXIpCmRmX2ttZWFuc1tbbl9pdGVyXV0KYGBgCkFsc28gd2UgY2FuIHZpZXcgbWFpbiByZXN1bHRzIG9mIGtfbWVhbnMgYWxnb3JpdGhtIGZvciBjaG9zZW4gaXRlcmF0aW9uCmBgYHtyIHdhcm5pbmc9RkFMU0UsIHBhZ2VkLnByaW50PUZBTFNFfQpteV9rbWVhbnMgPC0gZ2V0X2ttZWFucyhuX2l0ZXIsIGRmX2ttZWFucykKbXlfa21lYW5zCmBgYAogLi4uIGFuZCBjb21wYXJlIGl0IHRoaXMgcmVzdWx0IG9mIGJ1aWxkLWluIGstbWVhbnMoKSBmdW5jdGlvbgpgYGB7ciBlY2hvPUZBTFNFfQprbWVhbnMoZGZfeHlbLCAtMV0sIGNlbnRlcnMgPSAzLCBpdGVyLm1heCA9IG5faXRlcikKYGBgCkFuZCBob3cgYWJvdXQgdmlzdWFsaXphdGlvbj8gQW5kIGl0IGlzIGRlc2lyYWJsZSB3aXRoIGFuaW1hdGlvbiEgICAKWWVzIG9mIGNvdXJzZSEgQnV0IGZpcnN0IG1ha2UgImxvbmciIGRhdGFmcmFtZSBmcm9tIHRoZSBkZl9rbWVhbnMgbGlzdDoKYGBge3J9CmRmX2xvbmcgPC0gbWFwX2RmcihkZl9rbWVhbnMsIH4gLikKYGBgClRhIGRhIQpgYGB7ciB3YXJuaW5nPUZBTFNFLCBmaWcuc2hvdyA9ICdhbmltYXRlJ30KdGhlbWVfYW5pIDwtIHRoZW1lKAogICAgICAgICAgcGxvdC50aXRsZSA9IGVsZW1lbnRfdGV4dChzaXplID0gMTQsIGZhY2UgPSAiYm9sZCIsIGhqdXN0ID0gMC41KSwKICAgICAgICAgIGxlZ2VuZC50aXRsZSA9IGVsZW1lbnRfdGV4dChmYWNlID0gImJvbGQiLCBjb2xvdXIgPSAiYmxhY2siLCBzaXplID0gMTIpLAogICAgICAgICAgbGVnZW5kLnRleHQgPSBlbGVtZW50X3RleHQoc2l6ZSA9IDExLCBjb2xvciA9ICJibGFjayIpLAogICAgICAgICAgYXhpcy50ZXh0ID0gZWxlbWVudF90ZXh0KHNpemUgPSAxMSwgY29sb3VyID0gImJsYWNrIiksCiAgICAgICAgICBheGlzLnRpdGxlID0gZWxlbWVudF90ZXh0KHNpemUgPSAxMiwgZmFjZSA9ICJib2xkIikpCgpnZ3Bsb3QoZGZfbG9uZywgYWVzKHgsIHksIGNvbG9yID0gQ2x1c3RlciwgZnJhbWUgPSBJdGVyYXRpb24pKSArCiAgZ2VvbV9wb2ludChhZXMoZ3JvdXAgPSBDbHVzdGVyKSwgc2l6ZSA9IDUpICsKICBnZW9tX3BvaW50KGFlcyh4ID0gQ3gsIHkgPSBDeSksIHNoYXBlID0gMywgc2l6ZSA9IDUsIGNvbG9yID0gImJsYWNrIiwgc3Ryb2tlID0gMSkgKwogIGdlb21fcGF0aChhZXMoeCA9IEN4LCB5ID0gQ3ksIGN1bXVsYXRpdmUgPSBUUlVFLCBncm91cCA9IENsdXN0ZXIpLAogICAgICAgICAgICBjb2xvciA9ICJibGFjayIpICsKICBnZW9tX3RleHQoYWVzKHgsIHksIGxhYmVsID0gcm93aWQpLCB2anVzdCA9IDEuOCwgc2l6ZSA9IDQsIGNvbG9yID0gJ2JsYWNrJykgKwogIGdlb21fcG9seWdvbihkYXRhID0gZGZfcG9seWdvbihkZl9rbWVhbnMpLAogICAgICAgICAgICAgICBhZXMoeCwgeSwgZ3JvdXAgPSBDbHVzdGVyLCBmaWxsID0gQ2x1c3RlciksIGFscGhhID0gLjUpICsKICBjb29yZF9lcXVhbCgpICsKICBsYWJzKHRpdGxlID0gIkhvdyBrLW1lYW5zIHdvcms6IikgKwogIHRoZW1lX2FuaSAtPiBnZwpnZ2FuaW1hdGUoZ2cpCmBgYApgYGB7ciBmaWcuc2hvdyA9ICdhbmltYXRlJ30KZ2dwbG90KGRmX2xvbmcsIGFlcyh4ID0gcm93aWQsIHkgPSBFdV9kaXN0LCBmcmFtZSA9IEl0ZXJhdGlvbiwgY3VtdWxhdGl2ZSA9IFRSVUUpKSArCiAgZ2VvbV9wb2ludChhZXMoY29sb3IgPSBJdGVyYXRpb24pLCBzaXplID0gNSkgKwogIGdlb21fbGluZShhZXMoY29sb3IgPSBJdGVyYXRpb24pLCBzaXplID0gMSkgKwogIHNjYWxlX3hfY29udGludW91cyhicmVha3MgPSAxOmRpbShkZl94eSlbMV0pICsKICBsYWJzKHggPSAiT2JzZXJ2YXRpb25zIiwgeSA9ICJFdWNsaWRlYW4gZGlzdGFuY2Ugc3F1YXJlZCIsCiAgICAgICB0aXRsZSA9ICJTcXVhcmVkIEV1Y2xpZGVhbiBkaXN0YW5jZSBmb3IgZWFjaCBvYnNlcnZhdGlvbiBvbjoiKSArCiAgdGhlbWVfYW5pIC0+IGdnX2Rpc3QKCmdnYW5pbWF0ZShnZ19kaXN0KQpgYGAKYGBge3IgZmlnLnNob3cgPSAnYW5pbWF0ZSd9CmRmX2xvbmcgJT4lIAogIGdyb3VwX2J5KEl0ZXJhdGlvbikgJT4lIAogIHN1bW1hcmlzZSh0b3Rfc3VtID0gc3VtKEV1X2Rpc3QpKSAlPiUgCiAgdW5ncm91cCgpICU+JSAKICBnZ3Bsb3QoYWVzKHggPSBJdGVyYXRpb24sIHkgPSB0b3Rfc3VtLCBmcmFtZSA9IEl0ZXJhdGlvbiwgY3VtdWxhdGl2ZSA9IFRSVUUpKSArCiAgICBnZW9tX2JhcihzdGF0ID0gJ2lkZW50aXR5JywgYWVzKGZpbGwgPSBJdGVyYXRpb24pKSArCiAgICBsYWJzKHggPSBOVUxMLCB0aXRsZSA9ICJUb3RhbCB3aXRoaW4tY2x1c3RlciBzdW0gb2Ygc3F1YXJlIG9uOiIpICsKICAgIHRoZW1lX2FuaSAtPiBnZ190b3Rfc3VtCgpnZ2FuaW1hdGUoZ2dfdG90X3N1bSkKYGBgCgojIyMjICoqQ29tbWVudCMxKiogIAoKQnkgdGhlIGVuZCBJIHdhbnQgdG8gbWVudGlvbiBzb21lIHRoaW5ncy4gICAKVGhpcyBpcyBhIGRlbW9uc3RyYXRpb24gZXhhbXBsZSwgc28gdGhlIGstbWVhbnMgYWxnb3JpdGhtIGltcGxlbWVudGVkIGhlcmUgbWF5IGJlIGZhciBmcm9tIGVmZmVjdGl2ZS4gQmVjYXVzZSBvZiB0aGlzLCBJIGRvIG5vdCByZWNvbW1lbmQgdXNpbmcgaXQgd2l0aCBsYXJnZSBkYXRhc2V0cyBhbmQgYSBsYXJnZSBudW1iZXIgb2YgaXRlcmF0aW9ucyAoPiAxMCkuIE9mIGNvdXJzZSwgeW91IGNhbiB1c2UgeW91ciBvd24gZGF0YXNldCBpZiB5b3UgYXJlIGludGVyZXN0ZWQuICAgCgojIyMjICoqQ29tbWVudCMyKiogICAKClNvLCB3aGF0IGhhdmUgSSBhY2hpZXZlZCBieSByZWludmVudGluZyB0aGUgd2hlZWw6ICAgCjEuICBVbmRlcnN0b29kIHRoZSBbay1tZWFuc10oaHR0cHM6Ly9lbi53aWtpcGVkaWEub3JnL3dpa2kvSy1tZWFuc19jbHVzdGVyaW5nKSBhbGdvcml0aG0sIG9mIGNvdXJzZSkgICAKMi4gIEltcHJvdmVkIG15IHNraWxscyBpbiBbdHlkaXZlcnNlXShodHRwczovL3d3dy50aWR5dmVyc2Uub3JnLykgICAKMy4gIExlYXJuZWQgdG8gd3JpdGUgcmVjdXJzaXZlIGZ1bmN0aW9ucyB0aGF0IHJldHVybiB0d28gdmFsdWVzLCBvbmUgb2Ygd2hpY2ggaXMgbmVjZXNzYXJ5IGZvciByZWN1cnNpb24gd29yaywgYW5kIHRoZSBvdGhlciAtIGZvciBleHRlcm5hbCBuZWVkcyAodGhpcyBpcyB3aGF0IHdyYXBlcl9rbWVhbnMoKSByZXR1cm4pLiBUaGlzIHdhcyBhY2hpZXZlZCBieSB0aGUgYWRkaXRpb25hbCB3cmFwcGVyIGZ1bmN0aW9uIGFuZCB0aGUgY29uY2VwdCBvZiBlbnZpcm9ubWVudHMuICAKNC4gIExlYXJuZWQgdG8gbWFrZSBhbmltYXRlZCBjaGFydHMgd2l0aCBbZ2dhbmltYXRlXShodHRwczovL2dpdGh1Yi5jb20vdGhvbWFzcDg1L2dnYW5pbWF0ZSkuICAKNS4gIEV4YW1pbmVkIFtjaHVsbF0oaHR0cHM6Ly93d3cucmRvY3VtZW50YXRpb24ub3JnL3BhY2thZ2VzL2dyRGV2aWNlcy92ZXJzaW9ucy8zLjUuMS90b3BpY3MvY2h1bGwpIGZ1bmN0aW9uIGZyb20gYmFzZSBSICAgCjYuICBMZWFybmVkIGhvdyB0byBbaGlkZSBjb21tZW50c10oaHR0cHM6Ly9zdGFja292ZXJmbG93LmNvbS9xdWVzdGlvbnMvNDgyMzQ2OC9jb21tZW50cy1pbi1tYXJrZG93bikgaW4gW1JtYXJrZG93bl0oaHR0cHM6Ly9ybWFya2Rvd24ucnN0dWRpby5jb20vaW5kZXguaHRtbCkgIAogICAgCkknZCBiZSBhcHByaWNpYXRlIGZvciB5b3VyIGNyaXRpY2FsIHJlbWFya3MgYW5kIGNvbW1lbnRzLiBXZWxsLCBvZiBjb3Vyc2UgSSB3aWxsIGJlIHZlcnktdmVyeS12ZXJ5IGhhcHB5IGlmIHRoaXMgbWF0ZXJpYWwgaGVscHMgeW91IHVuZGVyc3RhbmQgay1tZWFucyBhcyBpdCBoZWxwZWQgbWUhIFlvdSBjYW4gZmluZCBjb2RlIFtoZXJlXShodHRwczovL2dpdGh1Yi5jb20vRG15dHJvUnliYWxrby9rLW1lYW5zX2V4cGxhbmF0aW9uKQo=