Div-2 Contest
Div-1 Contest

Author and Editorialist: Simon St James
Tester: Suchan Park


Medium-Hard TODO - revisit this - seems harder than MVCN2TST


Sprague-Grundy, Centroid Decomposition, Bit Manipulation, Ad-hoc


Bob and Alice are playing a game on a board which is a tree, $T$. Each node of $T$ has some number of coins placed on it. For a node $R$, define $\textit{game}(T, R)$ to be the game played on $T$ in which players take turns to move a coin from some node other than $R$ strictly towards $R$. The first player unable to make a move loses the game. For each $R$, find the winner of $\textit{game}(T, R)$, assuming both players play perfectly.


The game $\textit{game}(T, R)$ is equivalent to the game of Nim, where there is a one-to-one correspondence between coins on the board and piles of stones: for a coin $c$, if $v_c$ is the node containing $c$, then $c$ corresponds to a Nim pile of size $\textit{dist}(R,v_c)$. Thus, the Sprague-Grundy Theorem can be used. Some simple observations show that the exact number of coins $c_v$ on a node $v$ is not important, only its parity: we set $V_{\textit{coin}}$ to be the set of nodes $v$ such that $c_v$ is odd.


$$R.\textit{grundy}=\bigoplus_{v\in V_{\textit{coin}}}{\textit{dist}(R,v)}$$

Then by the Sprague-Grundy Theorem, the second player (Bob) wins $\textit{game}(T, R)$ if and only if $R.\textit{grundy}=0$, so we need only calculate this value for all $R$.

For nodes $u$ and $v$ where $v\in V_{\textit{coin}}$, define the contribution of $v$ to $u$ as $\textit{dist}(u,v)$, and the act of updating $u.\textit{grundy}=u.\textit{grundy}\oplus \textit{dist}(u,v)$ as propagating $v$'s contribution to $u$.

We form a $\textit{DistTracker}$ data structure with the following API:

class DistTracker
    insertDist(distance) { ... }
    addToAllDists(distance) { ... }
    grundyNumber() { ... } // Return the xor sum of all the contained distances 

In a naive implementation, at least one of these operations would be $\mathcal{O}(N)$, but by observing how bits in the binary representation of a number change upon incrementing it and using some properties of xor, we can implement all of $\textit{DistTracker}$'s operations in $\mathcal{O}(\log N)$ or better.

We then use Centroid Decomposition plus our $\textit{DistTracker}$ to collect all contributions of $v$ with $v\in V_{\textit{coin}}$ and propagate them to all nodes $R$, thus calculating all required $R.\textit{grundy}$.


As mentioned, this is the game of Nim in disguise: in Nim, we start with some number $M$ of piles, the $i^\textit{th}$ of which contains $p_i$ stones, and players take turns to choose a non-empty pile and take at least one stone from it, until a player cannot make a move, in which case they lose. In the game $\textit{game}(T, R)$, let $v_C$ be the node containing the coin $C$; then the correspondence between the two games is as follows: each coin $C$ corresponds to a pile of $\textit{dist}(R, v_C)$ stones, and moving $C$ strictly towards $R$ corresponds to taking at least one stone from that pile.

The Sprague-Grundy Theorem proves several interesting statements, but the one we're interested in is the remarkable (and very unintuitive!) result that in the game of Nim, the second player wins if and only if the grundy number for the game is $0$, where the grundy number is the xor-sum of all the pile sizes, i.e. $p_1 \oplus p_2 \oplus \dots \oplus p_M$. Applying this to our game, we see that Bob wins if and only if the grundy number $R.\textit{grundy}$ for the game $\textit{game}(T, R)$, defined by:

$$R.\textit{grundy}=\bigoplus_{c\in \textit{coins}}{\textit{dist}(R,v_c)}$$

is $0$.

We can simplify this a little: consider two coins $C_A$ and $C_B$, both on the same node $v$. Their contribution to $R.\textit{grundy}$ is $\textit{dist}(R, v) \oplus \textit{dist}(R, v)$. But $x \oplus x = 0$ for all $x$, so we can remove both coins without affecting $R.\textit{grundy}$. For each $v$, we can safely remove pairs of coins from $v$ until either $0$ or $1$ remain, depending on whether the number of coins $c_v$ originally on $v$ was even or odd, respectively. We say that a node $v$ $\textit{hasCoin}$ if $c_v$ is odd, and set $V_\textit{coin}$ to be the set of all such $v$. We can now rephrase the formula for $R.\textit{grundy}$:

$$R.\textit{grundy}=\bigoplus_{v \in V_{\textit{coin}} }{\textit{dist}(R,v)}$$

Recalling the definitions of contribution and propagation from the Quick Explanation, we see that to solve the problem, we merely need to ensure that for each $v \in V_{\textit{coin}}$ and every $R$, $v$'s contribution to $R$ is propagated to $R$.
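Before optimising anything, it can help to have a slow reference to check against. The following is a minimal sketch, assuming 0-indexed nodes and an edge list as input; the function name `brute_force_grundy` is made up for illustration. It evaluates the formula above directly with one BFS per root, so it is $\mathcal{O}(N^2)$ and only suitable for small trees.

```python
from collections import deque

def brute_force_grundy(n, edges, coins):
    """Return [R.grundy for R in 0..n-1] using one BFS per root R."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Only the parity of the coin count on each node matters.
    has_coin = [c % 2 == 1 for c in coins]
    result = []
    for root in range(n):
        dist = [-1] * n
        dist[root] = 0
        queue = deque([root])
        grundy = 0
        while queue:
            u = queue.popleft()
            if has_coin[u]:
                grundy ^= dist[u]  # propagate u's contribution to root
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result.append(grundy)
    return result

# Chain 0-1-2 with 3, 2 and 1 coins: the pair on node 1 cancels out.
grundies = brute_force_grundy(3, [(0, 1), (1, 2)], [3, 2, 1])
winners = ["Bob" if g == 0 else "Alice" for g in grundies]
```

In this small chain only the middle node is equidistant from both remaining coins, so only $R=1$ has a grundy number of $0$.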

Let's consider for the moment the special case where $T$ is simply a long chain of nodes. Imagine we had a $\textit{DistTracker}$ data structure with the below API (a naive implementation is also provided):

    class DistTracker
        insertDist(distance)
            trackedDistances.append(distance)
        addToAllDists(toAdd)
            for each distance in trackedDistances:
                distance += toAdd
        grundyNumber()
            result = 0
            for each distance in trackedDistances:
                result = result ^ distance
            return result
        clear()
            trackedDistances = []
        trackedDistances = []

Imagine further that we proceed along the chain of nodes from left to right performing at each node $v$ the following steps:


This way, we collect then propagate the contribution of each $v\in V_{\textit{coin}}$ to all nodes to $v$'s right.

Let's $\textit{clear}()$ our $\textit{distTracker}$ and repeat the process, this time working in the opposite direction:

Now we've propagated the contribution of each $v \in V_{\textit{coin}}$ to all nodes to $v$'s right and to its left, i.e. to all nodes, and so have computed all $R.\textit{grundy}$, as required. It turns out that Bob wins two of the games and Alice wins the rest.
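The two sweeps along the chain can be sketched as follows. This is an illustration rather than the setter's code: the per-node order used here (shift all tracked distances by one, propagate to the node, then collect the node's own coin) is an assumption that is consistent with the description above, and the names `NaiveDistTracker` and `chain_grundies` are hypothetical.

```python
class NaiveDistTracker:
    """Direct transcription of the naive pseudocode; add_to_all_dists is O(N)."""

    def __init__(self):
        self.tracked = []

    def insert_dist(self, distance):
        self.tracked.append(distance)

    def add_to_all_dists(self, to_add):
        self.tracked = [d + to_add for d in self.tracked]

    def grundy_number(self):
        result = 0
        for d in self.tracked:
            result ^= d
        return result

    def clear(self):
        self.tracked = []


def chain_grundies(has_coin):
    """has_coin[v] is True if node v of the chain holds an odd number of coins."""
    n = len(has_coin)
    grundy = [0] * n
    tracker = NaiveDistTracker()
    # First sweep: left to right. Second sweep: right to left.
    for order in (range(n), reversed(range(n))):
        tracker.clear()
        for v in order:
            tracker.add_to_all_dists(1)          # every coin seen so far is one step further away
            grundy[v] ^= tracker.grundy_number()  # propagate those contributions to v
            if has_coin[v]:
                tracker.insert_dist(0)            # collect v's own contribution
    return grundy


example = chain_grundies([True, True, False, True])
```

A node's own coin is at distance $0$ from it, so it never needs to be propagated to itself.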

The naive implementation of $\textit{DistTracker}$ given above is too slow to be of use: we'll fix this later but first, let's show how we can use Centroid Decomposition with our $\textit{DistTracker}$ to collect and propagate the contributions of all $v\in V_{\textit{coin}}$ on an arbitrary tree $T$. If you're already familiar with Centroid Decomposition, you may want to skip this part.

Using Centroid Decomposition to Propagate All Contributions

We won't go into much detail on Centroid Decomposition here as there are many resources about it, but here are the properties we care about:

C1: Centroid Decomposition of $T$ induces $M$ subtrees $T_i$ of $T$ ($M$ is $\mathcal{O}(N)$), each with a node $C_i$ that is the centre of $T_i$
C2: The sum $\Sigma_{i=1}^M |T_i|$ is $\mathcal{O}(N\log N)$
C3: Let $u,v$ be any distinct pair of nodes, and let $P(u,v)=[u=p_0, p_1, \ldots, p_k=v]$ be the unique path between $u$ and $v$. Then there is precisely one $i$ such that $u$ and $v$ are in subtree $T_i$ and $C_i \in P(u,v)$
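For readers who have not met Centroid Decomposition before, here is a minimal sketch of one common way to build it, assuming 0-indexed nodes and an edge list; the function name `centroid_decomposition` is hypothetical and the setter's code may well differ. It returns each $C_i$ together with the nodes of $T_i$, which makes C1 and C2 easy to check on small inputs.

```python
def centroid_decomposition(n, edges):
    """Return a list of (centroid, nodes_of_subtree) pairs, one per T_i."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    removed = [False] * n
    result = []
    stack = [0]
    while stack:
        start = stack.pop()
        # Collect the remaining component containing `start`, parents before children.
        order, parent = [start], {start: -1}
        for u in order:  # `order` grows while we iterate over it
            for v in adj[u]:
                if not removed[v] and v != parent[u]:
                    parent[v] = u
                    order.append(v)
        size = {u: 1 for u in order}
        for u in reversed(order):
            if parent[u] != -1:
                size[parent[u]] += size[u]
        total = len(order)
        # Walk towards the heavy child until no remaining part exceeds half the component.
        centroid = start
        while True:
            heavy = [v for v in adj[centroid]
                     if not removed[v] and v != parent[centroid] and 2 * size[v] > total]
            if not heavy:
                break
            centroid = heavy[0]
        result.append((centroid, order))
        removed[centroid] = True
        stack.extend(v for v in adj[centroid] if not removed[v])
    return result


# A path on 7 nodes: the first centroid is the middle node.
decomposition = centroid_decomposition(7, [(i, i + 1) for i in range(6)])
total_size = sum(len(nodes) for _, nodes in decomposition)
```

Every node becomes the centroid of exactly one $T_i$, so $M = N$ here, and `total_size` is the sum bounded by C2.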

Let $D_i$ be the degree of $C_i$ in $T_i$, and let $b_1, b_2, \dots, b_{D_i}$ be the neighbours of $C_i$ in $T_i$. We partition the $u\in T_i$, $u \ne C_i$ into $D_i$ branches $B_1, B_2, \dots, B_{D_i}$, where the node $u$ is in branch $B_l$ if the unique path from $C_i$ to $u$ passes through $b_l$. For example:

TODO - image here - medium size $T_i$ with $D_i = 4$ - MOVCOIN2_ED_3_THUMB.png, linking to MOVCOIN2_ED_3_ANIM.gif. In the meantime, you can probably figure it out from anims 6 & 7 XD

With this notation, C3 can be rephrased as:

C3: Let $u,v$ be any distinct pair of nodes; then there is precisely one $i$ such that $u$ and $v$ are in subtree $T_i$ and either:

  1. $C_i=u$; or
  2. $C_i=v$; or
  3. $u$ and $v$ are in different branches of $T_i$

from which it follows that doing the following for every $i=1,2,\ldots, M$:

  1. for each $j=1,2,\dots,D_i$, propagate the contributions of all $v \in V_\textit{coin}\cap B_j$ to the nodes in the other $D_i-1$ branches of $T_i$; and
  2. propagate the contributions of all $v \in V_\textit{coin}\cap T_i$ to $C_i$; and
  3. (if $C_i.\textit{hasCoin}$) propagate the contribution of $C_i$ to all other nodes in $T_i$

will propagate the contributions of all $v \in V_{\textit{coin}}$ to all $u \in T$, as required.

Both 2. and 3. can be done separately using a naive algorithm (although my implementation rolls them into 1.). 1. can be handled in a similar way to the "propagate-and-collect-and-then-in-reverse" approach from earlier, except now we are collecting and propagating a branch at a time, rather than a node at a time.

For each $i=1,2,\dots,M$, create a fresh $\textit{DistTracker}$ and, for each branch $j=1,2,\dots,D_i$ of $T_i$ in turn, perform the following steps:

  1. Propagate the contributions collected so far to the nodes in branch $j$; that is, do a DFS from $b_j$, calling $\textit{addToAllDists}(1)$ when we visit a node $u$ for the first time, then updating $u.\textit{grundy}=u.\textit{grundy}\oplus\textit{grundyNumber}()$, and calling $\textit{addToAllDists}(-1)$ when we have fully explored $u$
  2. Collect the contributions of nodes in branch $j$; that is, do a DFS from $b_j$, calling $\textit{insertDist}(d)$ when we encounter a node in $V_{\textit{coin}}$ at distance $d$ from $C_i$.

A BFS would also work and would likely be slightly more efficient.

Then we $\textit{clear}()$ our $\textit{DistTracker}$ and repeat, this time with $j=D_i,D_i-1,\dots,2,1$.
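Putting the pieces together, the branch-by-branch procedure might look like the sketch below. It is an illustration under several assumptions, not the setter's implementation: it uses the naive tracker (so it is not fast), it uses recursion (fine for small trees only), and it rolls the centroid's own handling into the forward pass by inserting distance $0$ for $C_i$ up front and reading $\textit{grundyNumber}()$ for $C_i$ once all branches have been collected. All function names are hypothetical.

```python
class NaiveDistTracker:
    def __init__(self):
        self.tracked = []

    def insert_dist(self, distance):
        self.tracked.append(distance)

    def add_to_all_dists(self, to_add):
        self.tracked = [d + to_add for d in self.tracked]

    def grundy_number(self):
        result = 0
        for d in self.tracked:
            result ^= d
        return result

    def clear(self):
        self.tracked = []


def grundies_by_centroids(n, edges, has_coin):
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    removed = [False] * n
    grundy = [0] * n

    def find_centroid(start):
        order, parent = [start], {start: -1}
        for u in order:  # `order` grows while we iterate over it
            for v in adj[u]:
                if not removed[v] and v != parent[u]:
                    parent[v] = u
                    order.append(v)
        size = {u: 1 for u in order}
        for u in reversed(order):
            if parent[u] != -1:
                size[parent[u]] += size[u]
        c = start
        while True:
            heavy = [v for v in adj[c] if not removed[v]
                     and v != parent[c] and 2 * size[v] > len(order)]
            if not heavy:
                return c
            c = heavy[0]

    def propagate(u, parent, tracker):
        tracker.add_to_all_dists(1)       # entering u: tracked coins are one step further
        grundy[u] ^= tracker.grundy_number()
        for v in adj[u]:
            if v != parent and not removed[v]:
                propagate(v, u, tracker)
        tracker.add_to_all_dists(-1)      # leaving u: undo the step

    def collect(u, parent, depth, tracker):
        if has_coin[u]:
            tracker.insert_dist(depth)    # depth is the distance from the centroid
        for v in adj[u]:
            if v != parent and not removed[v]:
                collect(v, u, depth + 1, tracker)

    stack = [0]
    while stack:
        c = find_centroid(stack.pop())
        branches = [b for b in adj[c] if not removed[b]]
        tracker = NaiveDistTracker()
        if has_coin[c]:
            tracker.insert_dist(0)        # the centroid's contribution, forward pass only
        for b in branches:
            propagate(b, c, tracker)
            collect(b, c, 1, tracker)
        grundy[c] ^= tracker.grundy_number()  # every coin in T_i contributes to the centroid
        tracker.clear()
        for b in reversed(branches):
            propagate(b, c, tracker)
            collect(b, c, 1, tracker)
        removed[c] = True
        stack.extend(branches)
    return grundy


star = grundies_by_centroids(4, [(0, 1), (0, 2), (0, 3)], [True] * 4)
```

The centroid's distance $0$ is inserted only before the forward pass: inserting it again before the reverse pass would propagate it to every branch twice, and the two copies would cancel.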

TODO - animation - MOVCOIN2_ED_5_THUMB.png, linking to MOVCOIN2_ED_5_ANIM.gif

We now return to optimising our $\textit{DistTracker}$. It often helps to take a bitwise approach to problems involving xor sums, and this is the case here. Let's have a look at the binary representation of an increasing series of numbers and observe how each bit changes. The numbers along the top of the table are the bit number, with bit number $0$ being the least significant bit.

N 5 4 3 2 1 0
0 0 0 0 0 0 0
1 0 0 0 0 0 1
2 0 0 0 0 1 0
3 0 0 0 0 1 1
4 0 0 0 1 0 0
5 0 0 0 1 0 1
6 0 0 0 1 1 0
7 0 0 0 1 1 1
8 0 0 1 0 0 0

The pattern is clear: the $x^{\text{th}}$ bit is $0$ $2^x$ times in a row, then $1$ $2^x$ times in a row, and continues flipping every $2^x$ increments. We can exploit this pattern in our $\textit{DistTracker}$ to maintain a count of the number of tracked distances that have their $x^{\text{th}}$ bit set: the animation below illustrates this approach with the original example. Here, row $x$ of the grid tracks bit $x$, each tracked distance appears as a coin on every row, and the red-one-zone of a row marks the cells in which that row's bit is set.

Note that the $x^{\text{th}}$ bit of the grundy number is set if and only if the number of tracked distances with their $x^{\text{th}}$ bit set is odd, so pairs of distances with their $x^{\text{th}}$ bit set contribute nothing to the grundy number and so are crossed out. If we know which bits of the grundy number are set, computing the number itself is trivial.

Note that the fourth row of the grid is omitted: since the graph only has 8 nodes, the max distance between two nodes is seven, and so a tracked distance can never enter the red-one-zone for a fourth row and change the grundy number. Similar logic is used to reduce $\textit{m\_numBits}$ in the $\textit{DistTracker}$ implementation. In general, the number of bits/rows is $\mathcal{O}(\log (\textit{max distance between nodes}))$.

With this new $\textit{DistTracker}$, the computation of the grundy number (the xor of all the tracked distances) has been rolled into $\textit{addToAllDists}()$, so $\textit{grundyNumber}()$ has been reduced from $\mathcal{O}(N)$ to $\mathcal{O}(1)$, a substantial improvement; however, $\textit{addToAllDists}()$ remains $\mathcal{O}(N)$ as it must still move all coins on each call, so we don't appear to have gained much.

However, what if on each call to $\textit{addToAllDists}(1)$, for each $x$, instead of moving all coins on row $x$ one cell to the right and tracking whether they enter the red-one-zone, we scrolled the red-one-zone on that row by one to the left and counted how many coins it hits or leaves? Since the number of rows is $\mathcal{O}(\log N)$, $\textit{addToAllDists}(1)$ is now $\mathcal{O}(\log N)$, so all operations on $\textit{DistTracker}$ are now $\mathcal{O}(\log N)$ in the worst case.
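One way to realise the "scroll the zone instead of the coins" idea is sketched below. It is an assumption-laden illustration, not the setter's $\textit{DistTracker}$: instead of moving stored values it keeps a global `offset`, and for each bit $x$ it stores how many tracked values fall in each residue class modulo $2^{x+1}$. It only supports shifts of $+1$ and $-1$ (all that the DFS needs) and assumes every tracked distance stays within $0..\textit{max\_dist}$.

```python
class DistTracker:
    """Tracks distances; insert and +1/-1 shifts cost O(log max_dist), the xor query is O(1)."""

    def __init__(self, max_dist):
        # Distances never exceed max_dist, so higher bits are never set.
        self.num_bits = max(1, max_dist.bit_length())
        self.offset = 0   # total amount added to every tracked distance so far
        self.grundy = 0   # xor of all tracked distances, kept up to date
        # counts[b][r]: number of stored keys congruent to r modulo 2^(b+1),
        # where key = distance - offset at the moment of insertion.
        self.counts = [[0] * (1 << (b + 1)) for b in range(self.num_bits)]

    def insert_dist(self, distance):
        for b in range(self.num_bits):
            self.counts[b][(distance - self.offset) % (1 << (b + 1))] += 1
        self.grundy ^= distance

    def add_to_all_dists(self, delta):
        assert delta in (1, -1)
        if delta == -1:
            self.offset -= 1  # after this, key + offset is the smaller value of each pair
        for b in range(self.num_bits):
            half, mod = 1 << b, 1 << (b + 1)
            # Bit b differs between x and x + 1 exactly when x % mod is half - 1 or mod - 1.
            crossing = (self.counts[b][(half - 1 - self.offset) % mod]
                        + self.counts[b][(mod - 1 - self.offset) % mod])
            if crossing % 2 == 1:
                self.grundy ^= half
        if delta == 1:
            self.offset += 1

    def grundy_number(self):
        return self.grundy

    def clear(self):
        self.__init__((1 << self.num_bits) - 1)


tracker = DistTracker(7)
tracker.insert_dist(0)
tracker.add_to_all_dists(1)
tracker.insert_dist(0)
tracker.add_to_all_dists(1)
tracker.add_to_all_dists(1)   # tracked distances are now 3 and 2
after_two_coins = tracker.grundy_number()
tracker.insert_dist(5)        # tracked distances are now 3, 2 and 5
tracker.add_to_all_dists(1)   # tracked distances are now 4, 3 and 6
after_three_coins = tracker.grundy_number()
```

The total size of the `counts` arrays is proportional to `max_dist`, which is why it matters that each tracker is sized to the subtree it serves rather than to $N$.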

And that's it!

TODO - end less lamely. Should maybe mention somewhere that reducing m_numBits gives asymptotic gains i.e. without it, creating a $\textit{DistTracker}$ for each Centroid $T_i$ would contribute $O(N^2)$ to the runtime

Complexity Analysis


A Faster Alternative to Centroid Decomposition?

When I first solved this problem (and CHGORAM2), I didn't know much about the properties of Centroid Decomposition and so came up with my own approach, which I called the light-first DFS:

TODO - add publically available link to

This runs quite a bit faster than my Centroid Decomposition based solution. Questions for the interested reader :)

The fastest solution that I saw was @sg1729's, running in just 0.63s.


Setter's Solution (C++)

Tester's Solution (Kotlin)

package MOVCOIN2

class Movcoin2Solver(private val N: Int, private val gph: List<List<Int>>, private val hasToken: List<Boolean>) {
    // For every constant d in 0..constantMax, precomputes the xor of (element + d) over all elements.
    private class XorOfElementPlusConstant (elements: List<Int>, val constantMax: Int) {
        private val MAX_BITS = 17

        val xored = IntArray(constantMax+1)

        init {
            for(b in 0 until MAX_BITS) {
                // For d = 0, bit b of an element is set iff (element mod 2^(b+1)) lies in l..r.
                var l = (1 shl b)
                var r = (1 shl (b+1)) - 1

                var cnt = 0

                val freq = IntArray(constantMax+1)
                for(it in elements) {
                    val target = it and ((1 shl (b+1)) - 1)
                    freq[target]++
                    if (target in l..r) cnt++
                }

                for (d in 0..constantMax) {
                    if (cnt % 2 == 1) xored[d] += 1 shl b

                    cnt -= freq.getOrElse(r) { 0 }

                    // Assumption: the window slides one step left here (the decrement is missing from the listing).
                    l--
                    r--
                    if(l < 0) l = (1 shl (b+1)) - 1
                    if(r < 0) r = (1 shl (b+1)) - 1

                    cnt += freq.getOrElse(l) { 0 }
                }
            }
        }

        fun getXorOfElementsPlus(constant: Int) = xored[constant]
    }

    private val marked = BooleanArray(N)

    private val subtreeSize = IntArray(N)
    private val getMaxSubtreeSizeWhenUIsRemoved = IntArray(N)

    private fun getCentroidInComponentOf(root: Int): Int {
        val queue: java.util.Queue<Pair<Int,Int>> = java.util.ArrayDeque<Pair<Int,Int>>()
        val order = mutableListOf<Pair<Int,Int>>()

        queue.add(Pair(root, -1))
        while(!queue.isEmpty()) {
            val (u, p) = queue.poll()
            // Assumption: the BFS order is recorded here, since the loops below iterate over `order`.
            order.add(Pair(u, p))
            subtreeSize[u] = 1
            getMaxSubtreeSizeWhenUIsRemoved[u] = 0
            for(v in gph[u]) if(!marked[v] && v != p) queue.add(Pair(v, u))
        }

        // Assumption: children must be handled before their parents, so walk the BFS order backwards.
        order.reverse()

        for((u, p) in order) {
            if(p >= 0) subtreeSize[p] += subtreeSize[u]
        }

        val numNodes = subtreeSize[root]

        for((u, p) in order) {
            getMaxSubtreeSizeWhenUIsRemoved[u] = maxOf(getMaxSubtreeSizeWhenUIsRemoved[u], numNodes - subtreeSize[u])
            if (p >= 0) {
                getMaxSubtreeSizeWhenUIsRemoved[p] = maxOf(getMaxSubtreeSizeWhenUIsRemoved[p], subtreeSize[u])
            }
            if (getMaxSubtreeSizeWhenUIsRemoved[u] <= numNodes / 2) {
                return u
            }
        }

        return -1
    }

    private fun getGrundys(): List<Int> {
        val grundy = IntArray(N)

        // Xors into grundy[u], for every u in root's component, the xor of (d_u + d_v) over all tokens v there.
        fun process(root: Int, initialD: Int) {
            val order = mutableListOf<Pair<Int,Int>>()

            val queue: java.util.Queue<Triple<Int,Int,Int>> = java.util.ArrayDeque<Triple<Int,Int,Int>>()

            queue.add(Triple(root, -1, initialD))
            while(!queue.isEmpty()) {
                val (u, p, d) = queue.poll()
                order.add(Pair(u, d))
                for(v in gph[u]) if(!marked[v] && v != p) queue.add(Triple(v, u, d+1))
            }

            val distances = mutableListOf<Int>()
            for((u, d) in order) if(hasToken[u]) distances.add(d)

            val maxDistance = order.maxBy(Pair<Int,Int>::second)!!.second

            val ds = XorOfElementPlusConstant(distances, maxDistance)
            for((u, d) in order) grundy[u] = grundy[u] xor ds.getXorOfElementsPlus(d)
        }

        val queue: java.util.Queue<Int> = java.util.ArrayDeque<Int>()

        // Assumption: node 0 seeds the queue; this extra process(0, 1) cancels the process(q, 1) call made for q = 0.
        queue.add(0)
        process(0, 1)

        while(!queue.isEmpty()) {
            val q = queue.poll()

            // Cancels the same-branch pairs counted by the parent centroid's process(u, 0).
            process(q, 1)

            val u = getCentroidInComponentOf(q)
            marked[u] = true
            process(u, 0)

            for(v in gph[u]) if(!marked[v]) queue.add(v)
        }

        return grundy.toList()
    }

    private fun getAnswer(grundy: List<Int>): Long {
        val MOD = 1000000007
        var pow2 = 1
        var ans = 0L
        for(value in grundy) {
            pow2 *= 2
            pow2 %= MOD
            if(value == 0) ans += pow2
        }
        return ans % MOD
    }

    fun run() = getAnswer(getGrundys())
}

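// Assumption: the parameter types were stripped from the listing; buffered reader/writer types are assumed.
class Movcoin2Connector(private val br: java.io.BufferedReader, private val bw: java.io.BufferedWriter) {
    var sumN = 0

    fun checkConstraints() {
        require(sumN <= 200000)
    }

    fun run() {
        val N = br.readLine()!!.toInt()
        require(N in 1..200000)
        // Assumption: sumN accumulates N across test cases for checkConstraints().
        sumN += N

        val grp = IntArray(N) { it }
        fun getGroup(x: Int): Int{
            val parents = generateSequence(x, { grp[it] }).takeWhile { grp[it] != it }
            val r = grp[parents.lastOrNull() ?: x]
            parents.forEach{ grp[it] = r }
            return r
        }

        val gph = List(N) { mutableListOf<Int>() }
        repeat(N-1) {
            val (a, b) = br.readLine()!!.split(' ').map{ it.toInt() - 1 }
            // Assumption: each edge is stored in both adjacency lists.
            gph[a].add(b)
            gph[b].add(a)

            // The input is a tree iff no edge joins two nodes that are already connected.
            val p = getGroup(a)
            val q = getGroup(b)
            require(p != q)
            grp[p] = q
        }

        val C = br.readLine()!!.split(' ').map(String::toInt)
        require(C.all{ it in 0..16 })

        // Assumption: the stripped argument maps each coin count to its parity.
        val solver = Movcoin2Solver(N, gph, C.map{ it % 2 == 1 })
        // Assumption: the answer for each test case is written on its own line.
        bw.write("${solver.run()}\n")
    }
}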

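fun main (args: Array<String>) {
    // Assumption: the stripped constructors are the standard buffered wrappers around System.in and System.out.
    val br = java.io.BufferedReader(java.io.InputStreamReader(System.`in`))
    val bw = java.io.BufferedWriter(java.io.OutputStreamWriter(System.`out`))

    val T = br.readLine()!!.toInt()
    require(T in 1..1000)

    val connector = Movcoin2Connector(br, bw)
    repeat(T) {
        // Assumption: each test case is handled by one call to run().
        connector.run()
    }
    // Assumption: the global constraint is checked and the output flushed once all cases are read.
    connector.checkConstraints()

    require(br.readLine() == null)
    bw.flush()
}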