Managing Chaincode Data Models

28 min read · Feb 22, 2024

Introduction

Normally, all chaincode functions (smart contracts) read from the blockchain ledger, write to it, or both. Although the ledger can store and return data as an unstructured byte array, the most common approach is to read and write JSON that follows a structured data model.

But what happens when the data model changes? How does Hyperledger Fabric (HLF) behave and how should our chaincode sources deal with these data model updates?

This article deals with these chaincode data model matters.

Framing the problem

Let’s imagine we create a chaincode that has 3 public functions (contracts):

SavePerson: receives data for a person and saves the person to the ledger
GetPerson: receives a person identifier and returns all the data stored in the ledger for that person
GetPersonHistory: receives a person identifier and returns all versions of the person (from the first time it was stored, all times it has been updated until the current version of the stored person in the blockchain).

With these 3 functions in mind, let’s imagine what the person data model could look like (all examples are written in Go, but the same logic applies to TypeScript or Java):

package models 

type Person struct {
    Name        string   `json:"name"`
    Age         uint     `json:"age"`
}

For the sake of organization, we will call this initial version v1.

Problem 1: The model is extended

Now let’s imagine we deploy this chaincode in a running network and we start storing persons. Everything would work nicely for some time, but after a few months, we have a new business requirement that requests that we add the surname to the persons, so now the model will be updated with the following:

package models 

type Person struct {
    Name        string   `json:"name"`
    Surname     string   `json:"surname"`
    Age         uint     `json:"age"`
}

For the sake of organization we will call this new version v2.

Ok, any new user will definitely include the surname, but there are a couple of questions that we need to answer:

What happens with all previously stored users? Will the surname field be empty when we get a person that was stored in the ledger using v1 of the chaincode?
How do HLF and our chaincode deal with this situation?
Should we update all “old” users to include the new surname? If so, should we have included a version attribute to identify all v1 users?

Problem 2: The data model is changed

Ok, let’s also imagine we have answered all pending questions for Problem 1 and we are living happily ever after with our new chaincode version (v2). After a few months, a new business requirement arrives: our business manager requires that the Age can be expressed as any number followed by a time unit (d for days, m for months and y for years). So now age is expressed as a string, and a 20-year-old person’s age could be written as “7300 d”, “240 m” or “20 y”. Let’s also imagine that our business manager is quite strict on the requirement and doesn’t want us to extend the model with new attributes for this change. The only alternative is to update the model like so:

package models 

type Person struct {
    Name        string   `json:"name"`
    Surname     string   `json:"surname"`
    Age         string   `json:"age"`
}
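To make the new string format concrete, here is a minimal sketch of how such a value could be validated before it is stored. The function name parseAge and its error messages are assumptions made for illustration; they are not part of the chaincode shown in this article.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseAge splits an age such as "240 m" into its number and unit.
// The accepted units (d, m, y) follow the requirement described above;
// the function itself is a hypothetical helper.
func parseAge(age string) (value uint64, unit string, err error) {
	parts := strings.Fields(age)
	if len(parts) != 2 {
		return 0, "", fmt.Errorf("age %q must look like \"<number> <unit>\"", age)
	}
	value, err = strconv.ParseUint(parts[0], 10, 64)
	if err != nil {
		return 0, "", fmt.Errorf("age %q has an invalid number: %w", age, err)
	}
	switch parts[1] {
	case "d", "m", "y":
		return value, parts[1], nil
	default:
		return 0, "", fmt.Errorf("age %q has an unknown unit %q", age, parts[1])
	}
}

func main() {
	for _, age := range []string{"240 m", "20 y", "20 years"} {
		value, unit, err := parseAge(age)
		fmt.Println(value, unit, err)
	}
}
```

The first two inputs are accepted, while “20 years” is rejected because the unit is not one of d, m or y.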

A new set of questions arises here:

What happens with all previously stored users (remember that in v1 and v2 the Age attribute was an unsigned integer, and now it is a string)?
How do HLF and our chaincode deal with this situation?

Initial conclusions

In this article we will see a practical approach to these 2 problems with real examples.

Dealing with the problem

After framing the problem, now it’s time to deal with it.

In the following, we will show and discuss code samples as well as executions of that code. The execution is done through an API, which we call “Oxia” for the sake of an example.

Simple implementation of our example

We have prepared a simple implementation of our example to see what would happen if we went ahead with the original design (direct example from the previous chapter). The chaincode implements 3 simple functions.

Implementing and testing V1

Let’s see the code for V1:

type Person struct {
 Name string `json:"name" validate:"required"`
 Age  uint   `json:"age" validate:"required"`
}

// StorePerson stores a person in the ledger
func (c *TsContract) StorePerson(
 ctx contractapi.TransactionContextInterface,
 name string,
 age uint,
) (result Person, err error) {
 result.Age = age
 result.Name = name

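 serializedPerson, encodeError := json.Marshal(result)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 if internalError := ctx.GetStub().PutState(name, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is still nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }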

 return
}

// LoadPerson loads a person from the ledger
func (c *TsContract) LoadPerson(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personBytes, err := ctx.GetStub().GetState(name)

 if err != nil {
  return
 }

 if personBytes == nil {
  err = errors.New(fmt.Sprintf("failed to retrieve the person %s from the world state", name))
  return
 }

 err = json.Unmarshal(personBytes, &result)
 if err != nil {
  return
 }

 return
}

// PersonHistory loads a person history from the ledger
func (c *TsContract) PersonHistory(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result []Person, err error) {
 personHistoryIterator, err := ctx.GetStub().GetHistoryForKey(name)
 if err != nil {
  return
 }

 defer personHistoryIterator.Close()

 for personHistoryIterator.HasNext() {
  personAsKeyValue, iteratorErr := personHistoryIterator.Next()
  if iteratorErr != nil { // check the error returned by Next(), not the outer err
   return result, iteratorErr
  }

  if personAsKeyValue.IsDelete {
   continue
  }

  var person *Person
  if err = json.Unmarshal(personAsKeyValue.Value, &person); err != nil {
   return
  }

  result = append(result, *person)
 }
 return
}

Let’s see what happens when we execute this code.

We begin executing the StorePerson function:

Nice! We stored a new person called “peter” with age 20. Now let’s retrieve him:

Ok, this also worked. Peter is definitely 20 years old. Now a year has passed, so let’s update his age to 21.

Happy birthday Peter!!, this also worked. Let’s just double check and retrieve Peter again.

Ok, Peter has definitely a new age. Now let’s retrieve Peter’s transaction history in our blockchain:

Good, we seem to be on the right track: Peter was initially stored aged 20 and, after his birthday, he is now 21.

Implementing and testing V2

Ok, we have arrived at the point where the business asks us to add a surname to our model. Let’s see what happens with Peter in this scenario. For the sake of transparency, here is our V2 chaincode source code.

type Person struct {
 Name    string `json:"name" validate:"required"`
 Age     uint   `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

// StorePerson stores a person in the ledger
func (c *TsContract) StorePerson(
 ctx contractapi.TransactionContextInterface,
 name string,
 age uint,
 surname string,
) (result Person, err error) {
 result.Age = age
 result.Name = name
 result.Surname = surname

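 serializedPerson, encodeError := json.Marshal(result)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 if internalError := ctx.GetStub().PutState(name, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is still nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }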

 return
}

// LoadPerson loads a person from the ledger
func (c *TsContract) LoadPerson(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personBytes, err := ctx.GetStub().GetState(name)

 if err != nil {
  return
 }

 if personBytes == nil {
  err = errors.New(fmt.Sprintf("failed to retrieve the person %s from the world state", name))
  return
 }

 err = json.Unmarshal(personBytes, &result)
 if err != nil {
  return
 }

 return
}

// PersonHistory loads a person history from the ledger
func (c *TsContract) PersonHistory(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result []Person, err error) {
 personHistoryIterator, err := ctx.GetStub().GetHistoryForKey(name)
 if err != nil {
  return
 }

 defer personHistoryIterator.Close()

 for personHistoryIterator.HasNext() {
  personAsKeyValue, iteratorErr := personHistoryIterator.Next()

  if iteratorErr != nil { // check the error returned by Next(), not the outer err
   return result, iteratorErr
  }

  if personAsKeyValue.IsDelete {
   continue
  }

  var person *Person
  if err = json.Unmarshal(personAsKeyValue.Value, &person); err != nil {
   return
  }

  result = append(result, *person)
 }
 return
}

Now we will also bring a new player into the scene (“Ann Smith” aged 35). Let’s begin storing Ann to the blockchain.

Good, now Ann Smith is registered in our contract. Let’s see what happens if we retrieve her and her history:

Ok, she is definitely there.

Yep, her transaction history seems to be just fine.

Now, what happened to Peter? Let’s check what we have if we retrieve him directly.

Ok, Peter is still alive and aged 21, although he does not have a Surname. Let’s see his transaction history:

Ooookay… Peter’s history seems quite fine, although his surname (“Woopdiwoop”) is not registered. Let’s add his surname.

Ok, this seems to work. Let’s check on Peter again.

Nice, now Peter, according to the current world-state, has his surname registered. Let’s check his transaction history once again:

Ok, this looks good. Peter didn’t have a surname, but now he has. The earlier versions of Peter are returned with a surname field too, but it is empty, because those records were written before the attribute existed. Only his latest version carries a nice surname.
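The empty surname on old records comes from how Go’s encoding/json treats keys that are missing from the input: the struct field simply keeps its zero value. The following standalone sketch reproduces this outside Fabric; the helper name decodeV2 is an assumption made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PersonV2 mirrors the extended model: surname was added after v1.
type PersonV2 struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     uint   `json:"age"`
}

// decodeV2 unmarshals ledger bytes into the v2 model.
func decodeV2(raw []byte) (PersonV2, error) {
	var p PersonV2
	err := json.Unmarshal(raw, &p)
	return p, err
}

func main() {
	// Bytes as a v1 chaincode would have written them: no surname key.
	storedByV1 := []byte(`{"name":"peter","age":21}`)

	p, err := decodeV2(storedByV1)
	fmt.Printf("%+v %v\n", p, err)
}
```

Decoding succeeds without an error and Surname is left as the empty string, which is why extending a model with new attributes does not break reads of older records.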

Checkpoint and conclusions of V2 implementation

Let’s recap what we have learned.

The blockchain is definitely ok with data model extensions.
Our chaincode implementation seems to work just fine when extending a model.

This looks good: nothing is broken and we just had to add the new attribute to Peter to make things work. But there is a fundamental problem that we are not dealing with: scale. In an application with 1,000 million users, how would we know which users still need a surname? It’s difficult to answer, and there are a couple of strategies for this; we will see how to deal with these situations further along in the article. Another problem is that it is not straightforward to apply such an “update” in every kind of chaincode. Imagine a chaincode that enforces special conditions (e.g. the balance must be high enough) before accepting updates: rewriting every state would not be easy unless we have a way to track the version of the model used to store the data in the first place (these kinds of problems are usually solved with blockchain forks). But first let’s implement our “hardcore” change (moving from V2 to V3 may be a dangerous path… let’s see what happens).

Implementing and testing V3

Ok, first let’s take a look at what our chaincode implementation looks like with V3:

type Person struct {
 Name    string `json:"name" validate:"required"`
 Age     string `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

// StorePerson stores a person in the ledger
func (c *TsContract) StorePerson(
 ctx contractapi.TransactionContextInterface,
 name string,
 age string,
 surname string,
) (result Person, err error) {
 result.Age = age
 result.Name = name
 result.Surname = surname

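 serializedPerson, encodeError := json.Marshal(result)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 if internalError := ctx.GetStub().PutState(name, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is still nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }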

 return
}

// LoadPerson loads a person from the ledger
func (c *TsContract) LoadPerson(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personBytes, err := ctx.GetStub().GetState(name)

 if err != nil {
  return
 }

 if personBytes == nil {
  err = errors.New(fmt.Sprintf("failed to retrieve the person %s from the world state", name))
  return
 }

 err = json.Unmarshal(personBytes, &result)
 if err != nil {
  return
 }

 return
}

// PersonHistory loads a person history from the ledger
func (c *TsContract) PersonHistory(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result []Person, err error) {
 personHistoryIterator, err := ctx.GetStub().GetHistoryForKey(name)
 if err != nil {
  return
 }

 defer personHistoryIterator.Close()

 for personHistoryIterator.HasNext() {
  personAsKeyValue, iteratorErr := personHistoryIterator.Next()

  if iteratorErr != nil { // check the error returned by Next(), not the outer err
   return result, iteratorErr
  }

  if personAsKeyValue.IsDelete {
   continue
  }

  var person *Person
  if err = json.Unmarshal(personAsKeyValue.Value, &person); err != nil {
   return
  }

  result = append(result, *person)
 }
 return
}

Now let’s try to retrieve Peter under these circumstances.

Ok, our contract implementation is suffering a lot. We can’t unmarshal Peter now, and everything seems broken.

Let’s try with his history, let’s hope we have better luck this time:

Ok, things are not looking good for us. JSON won’t unmarshal, and that’s because of the following:

Data stored in the ledger is represented as an array of bytes.
The bytes stored by V1 and V2 encode the Age attribute as a JSON number (it was an unsigned integer), but the V3 struct now expects a string. Go’s encoding/json does not convert between the two automagically; it returns a type error instead.
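The failure can be reproduced without a network. The sketch below feeds bytes shaped like a V2 record into a struct shaped like the V3 model; the helper names decodeV3 and isTypeMismatch are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// PersonV3 mirrors the v3 model, where age became a string.
type PersonV3 struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     string `json:"age"`
}

// decodeV3 unmarshals ledger bytes into the v3 model.
func decodeV3(raw []byte) (PersonV3, error) {
	var p PersonV3
	err := json.Unmarshal(raw, &p)
	return p, err
}

// isTypeMismatch reports whether err is a JSON type mismatch, i.e. the
// stored value has a different JSON type than the struct field expects.
func isTypeMismatch(err error) bool {
	var typeErr *json.UnmarshalTypeError
	return errors.As(err, &typeErr)
}

func main() {
	// A record written by V2: age is a JSON number.
	storedByV2 := []byte(`{"name":"peter","surname":"Woopdiwoop","age":21}`)

	_, err := decodeV3(storedByV2)
	fmt.Println(isTypeMismatch(err), err)
}
```

Detecting the mismatch with errors.As at least lets the chaincode report a clear “record was written with an older model” error instead of a generic failure.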

Checkpoint and conclusions of V3 implementation

We have definitely arrived at a no-go situation. No matter how we try to make this work, the old uint Age attribute cannot simply become a string under the same name. The design has flaws, and now we suffer. Given that we arrived at this point with our “poor” chaincode design, the only way forward is the following:

Extend the model to include a new string attribute named AgeInString (for example).
Implement a migration function that reads all persons currently stored in the blockchain, takes their Age, and stores the converted value in the new attribute. Then overwrite the record under its current key so that it includes the new attribute.
Retain (or discard) the “old” Age attribute in your data model (do as you like, but you won’t get away with having an attribute named like the old one with a new type).
Talk with your business owner and make him understand that, no matter how interested he is in reusing the old Age attribute to store the new string data, this is just NOT possible using blockchain technologies, where data is immutable (transactions are retained forever, and facts that have already been recorded can’t be changed).
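The first two steps of this list can be sketched as a pure conversion function. This is only an illustration under stated assumptions: the JSON tag ageInString, the function name migrateToV3 and the choice to treat the old number as years are all hypothetical. In chaincode, the returned bytes would then be written back with PutState under the record’s existing key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PersonV2 is the model that old records were written with.
type PersonV2 struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     uint   `json:"age"`
}

// PersonV3 keeps the old numeric age and adds the new string attribute.
// AgeInString is the example name used in the list above.
type PersonV3 struct {
	Name        string `json:"name"`
	Surname     string `json:"surname"`
	Age         uint   `json:"age"`
	AgeInString string `json:"ageInString"`
}

// migrateToV3 converts stored v2 bytes into v3 bytes. Treating the old
// number as years is an assumption of this sketch.
func migrateToV3(raw []byte) ([]byte, error) {
	var old PersonV2
	if err := json.Unmarshal(raw, &old); err != nil {
		return nil, fmt.Errorf("decode v2 person: %w", err)
	}
	migrated := PersonV3{
		Name:        old.Name,
		Surname:     old.Surname,
		Age:         old.Age,
		AgeInString: fmt.Sprintf("%d y", old.Age),
	}
	return json.Marshal(migrated)
}

func main() {
	out, err := migrateToV3([]byte(`{"name":"peter","surname":"Woopdiwoop","age":21}`))
	fmt.Println(string(out), err)
}
```

Because the old age attribute keeps its name and type, records that have not been migrated yet still decode into PersonV3, with AgeInString left empty.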

A data migration strategy to the rescue

In any software development, database migrations are one of the main techniques for dealing with an evolving data model. Unfortunately for blockchain developers, there is no straightforward way to deal with data model migrations. Why? Because one of the main characteristics of a blockchain data structure is data immutability: whilst the world state reflects the latest value of a given record (key, value), all previous values of that same key remain immutable and are stored for eternity with the data model that was originally used to write them to the blockchain.

Are database migrations a normal blockchain pattern? Not quite, as they are not easy to implement using blockchain technologies (there are academic papers revolving around this idea, as you can see in the References section of this article).

In other words, we can’t benefit from data migrations as we do with a normal database, because we still need to deal with previous versions of the same key (for example when retrieving the full history of a given key).

But we can still benefit from having our own data migration strategy. Here are some ideas that could help implement one:

Include version numbers in your data models: following the examples from the previous sections, we could have named the models PersonV1, PersonV2 and PersonV3. The remaining problem is knowing which version to use when unmarshalling a given person, since we don’t initially know which version was used to write that person to the blockchain at a specific moment in time. We can add more to the data migration implementation to solve this.
Use the version as part of the composite key that identifies a person: in Hyperledger Fabric, the key that identifies a record can be composed of multiple components (that’s why such keys are called “composite keys”: you compose them from multiple strings). With this in mind, we could store persons using a key that references the model version. Instead of saving the key “Peter” (using just the name to identify a record is a really bad idea, as two persons in our system will most likely share a name, but let’s keep it simple for educational purposes), we would use “V1#Peter” to store Peter with the first model, “V2#Peter” once we update him to comply with version 2, and “V3#Peter” after that. This way we can retrieve all persons stored with each version, and we know which model to use when unmarshalling the byte array in the chaincode. To retrieve all persons of a specific version, we can use GetStateByPartialCompositeKey or GetStateByPartialCompositeKeyWithPagination from HLF’s shim stub API. Both return a StateQueryIteratorInterface, which we can iterate over to, for example, get all persons in V1 and store a new version of each in V2 or V3. The same iterator also tells us which version was used to store a specific person: each iteration yields a queryresult.KV (package github.com/hyperledger/fabric-protos-go/ledger/queryresult) that includes the full key of the record, so we can inspect the key and infer the version.
Implement the data migration on top of the previous point: using GetStateByPartialCompositeKey or GetStateByPartialCompositeKeyWithPagination, an UpdatePersonToVX function would retrieve all records of a specific version and update them to the latest version. The migration function needs to deal with attribute type conversions and with filling in dummy data for new attributes in the model. One consequence is that the same person (for example “Peter”) will be stored once per version, so our chaincode also needs a way to retrieve the latest version of the record when reading the current value (stored in the world state).
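Inferring the version from a key can be sketched with SplitCompositeKey, the stub method that reverses CreateCompositeKey. To keep the example runnable without a peer, the sketch declares a tiny interface, keySplitter, with only that method, plus a local fakeStub. Both names are hypothetical, and the fake splits on “#” (the notation used in this article), which is not the separator Fabric uses internally.

```go
package main

import (
	"fmt"
	"strings"
)

// keySplitter is the one stub method this sketch needs. The chaincode
// stub offers SplitCompositeKey with this signature, so the stub from
// ctx.GetStub() could be passed in directly.
type keySplitter interface {
	SplitCompositeKey(compositeKey string) (string, []string, error)
}

// versionAndName extracts the model version and person name from a key
// built as CreateCompositeKey("Person", []string{version, name}).
func versionAndName(stub keySplitter, key string) (version, name string, err error) {
	objectType, attributes, err := stub.SplitCompositeKey(key)
	if err != nil {
		return "", "", err
	}
	if objectType != "Person" || len(attributes) != 2 {
		return "", "", fmt.Errorf("unexpected key layout: %q %v", objectType, attributes)
	}
	return attributes[0], attributes[1], nil
}

// fakeStub is a local stand-in so the sketch runs without a peer.
type fakeStub struct{}

// SplitCompositeKey splits on "#" for illustration only.
func (fakeStub) SplitCompositeKey(key string) (string, []string, error) {
	parts := strings.Split(key, "#")
	if len(parts) < 2 {
		return "", nil, fmt.Errorf("not a composite key: %q", key)
	}
	return parts[0], parts[1:], nil
}

func main() {
	version, name, err := versionAndName(fakeStub{}, "Person#V2#Peter")
	fmt.Println(version, name, err)
}
```

Reading the version from the key’s attributes avoids substring checks on the whole key, which would misfire for a person whose name happens to contain “V1”.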

So definitely, thinking about our data models and preparing our code for future changes is a good idea. In an ideal world, we could also simplify things by just making our data models robust enough such that they can withstand updates without a migration strategy, but developers should always consider future data migrations from day one (not doing so might really become a nightmare in the future).

Let’s see how this new implementation works.

Implementing V1 using our initial data migration strategy

Here is our chaincode in v1.

// V1

package public

import (
 "encoding/json"
 "errors"
 "fmt"
 "strings"
 "test-chaincode/capabilities/models"

 "github.com/hyperledger/fabric-chaincode-go/pkg/cid"
 "github.com/hyperledger/fabric-contract-api-go/contractapi"
 "github.com/hyperledger/fabric/common/flogging"
)

// TsContract provides the public test contract implementation
type TsContract struct{}

// Ping returns {ping: 'pong'}
func (c *TsContract) Ping() string {
 return "{\"ping\": \"pong\"}"
}

type Person struct {
 Name string `json:"name" validate:"required"`
 Age  uint   `json:"age" validate:"required"`
}

type PersonV1 struct {
 Name string `json:"name" validate:"required"`
 Age  uint   `json:"age" validate:"required"`
}

var logger = flogging.MustGetLogger("test_contract")

// CreateVersionedCompositeKey creates a versioned composite key that identifies persons with a specific version
func (c *TsContract) CreateVersionedCompositeKey(
 ctx contractapi.TransactionContextInterface,
 name string,
 version string,
) (compositekey string, err error) {
 attributes := []string{version, name}
 compositekey, err = ctx.GetStub().CreateCompositeKey("Person", attributes)
 if err != nil {
  return
 }

 return
}

// StorePerson stores a person in the ledger
func (c *TsContract) StorePerson(
 ctx contractapi.TransactionContextInterface,
 name string,
 age uint,
) (result Person, err error) {
 result.Age = age
 result.Name = name

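 serializedPerson, encodeError := json.Marshal(result)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 key, err := c.CreateVersionedCompositeKey(ctx, name, "V1")
 if err != nil {
  // Without a valid key there is nowhere to store the person
  return
 }

 if internalError := ctx.GetStub().PutState(key, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }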

 return
}

// LoadPerson loads a person from the ledger
func (c *TsContract) LoadPerson(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personQueryAsBytes := fmt.Sprintf(`{"selector":{"name":"%s"}}`, name)
 personIterator, err := ctx.GetStub().GetQueryResult(personQueryAsBytes)
 if err != nil {
  return
 }

 defer personIterator.Close()

 var person Person
 for personIterator.HasNext() {
  personAsKeyValue, iteratorErr := personIterator.Next()

  if iteratorErr != nil {
   return result, iteratorErr
  }

  if strings.Contains(personAsKeyValue.Key, "V1") {
   var personv1 PersonV1
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }
   person.Name = personv1.Name
   person.Age = personv1.Age

  } else {
   var personv1 PersonV1
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age
  }
 }

 result = person

 return
}

// UpdatePersonToLatestVersion updates a person to the latest chaincode data model
func (c *TsContract) UpdatePersonToLatestVersion(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personQueryAsBytes := fmt.Sprintf(`{"selector":{"name":"%s"}}`, name)
 personIterator, err := ctx.GetStub().GetQueryResult(personQueryAsBytes)
 if err != nil {
  return
 }

 defer personIterator.Close()

 person := &Person{} // allocate up front; writing fields through a nil *Person would panic
 for personIterator.HasNext() {
  personAsKeyValue, iteratorErr := personIterator.Next()

  if iteratorErr != nil {
   return result, iteratorErr
  }

  var personv1 *PersonV1
  if strings.Contains(personAsKeyValue.Key, "V1") {
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age

  } else {
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age
  }
 }

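 serializedPerson, encodeError := json.Marshal(person)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 key, err := c.CreateVersionedCompositeKey(ctx, name, "V1")
 if err != nil {
  // Without a valid key there is nowhere to store the person
  return
 }

 if internalError := ctx.GetStub().PutState(key, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }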

 result = *person

 return
}
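One detail of the listing above deserves care: LoadPerson builds its rich query by formatting the name straight into a JSON string, so a name containing a double quote would produce a malformed or altered selector. Note also that GetQueryResult is only available when the peer uses a state database that supports rich queries, such as CouchDB. Below is a minimal sketch of building the same selector with json.Marshal; the helper name personSelector is an assumption made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// personSelector builds the selector used to find a person by name.
// json.Marshal escapes quotes and backslashes in the name, so the
// result is always a single well-formed JSON document.
func personSelector(name string) (string, error) {
	query := map[string]map[string]string{
		"selector": {"name": name},
	}
	raw, err := json.Marshal(query)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	for _, name := range []string{"jill", `ji"ll`} {
		q, err := personSelector(name)
		fmt.Println(q, err)
	}
}
```

The returned string can be passed to GetQueryResult exactly as the formatted string is in the listing.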

Missing something? Yes, we don’t have the PersonHistory function. Why? Long story short: we are using versions within the keys, and the GetHistoryForKey function accepts a single full key, not a partial key. When loading a person we know the name but not the version, so we can’t call this function directly.

A short disclaimer: I am not saying that such a function can’t be implemented, but it won’t be straightforward. We only learn the version used to store a person once we query the ledger, so retrieving the full history of a person requires some deep thinking and added complexity, and may have other caveats. Furthermore, a person might be stored multiple times under different keys (one per version the person has been stored with), so this design moves us away from GetHistoryForKey. If our business accepts this loss, we are ok with this design. If not, we need to think about alternatives.

Let’s see how this works.

We will first store a person named “jill” aged 20.

Ok, that worked.

Now we will load “jill”.

Ok, that worked fine also.

As we don’t have our PersonHistory function, we can’t use that contract with this design.

Implementing V2 using our initial data migration strategy

Here is our chaincode in v2.

//V2

package public

import (
 "encoding/json"
 "errors"
 "fmt"
 "strings"
 "test-chaincode/capabilities/models"

 "github.com/hyperledger/fabric-chaincode-go/pkg/cid"
 "github.com/hyperledger/fabric-contract-api-go/contractapi"
 "github.com/hyperledger/fabric/common/flogging"
)

// TsContract provides the public test contract implementation
type TsContract struct{}

// Ping returns {ping: 'pong'}
func (c *TsContract) Ping() string {
 return "{\"ping\": \"pong\"}"
}

type Person struct {
 Name    string `json:"name" validate:"required"`
 Age     uint   `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

type PersonV1 struct {
 Name string `json:"name" validate:"required"`
 Age  uint   `json:"age" validate:"required"`
}

type PersonV2 struct {
 Name    string `json:"name" validate:"required"`
 Age     uint   `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

var logger = flogging.MustGetLogger("test_contract")

// CreateVersionedCompositeKey creates a versioned composite key that identifies persons with a specific version
func (c *TsContract) CreateVersionedCompositeKey(
 ctx contractapi.TransactionContextInterface,
 name string,
 version string,
) (compositekey string, err error) {
 attributes := []string{version, name}
 compositekey, err = ctx.GetStub().CreateCompositeKey("Person", attributes)
 if err != nil {
  return
 }

 return
}

// StorePerson stores a person in the ledger
func (c *TsContract) StorePerson(
 ctx contractapi.TransactionContextInterface,
 name string,
 age uint,
 surname string,
) (result Person, err error) {
 result.Age = age
 result.Name = name
 result.Surname = surname

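 serializedPerson, encodeError := json.Marshal(result)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 key, err := c.CreateVersionedCompositeKey(ctx, name, "V2")
 if err != nil {
  // Without a valid key there is nowhere to store the person
  return
 }

 if internalError := ctx.GetStub().PutState(key, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }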

 return
}

// LoadPerson loads a person from the ledger
func (c *TsContract) LoadPerson(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personQueryAsBytes := fmt.Sprintf(`{"selector":{"name":"%s"}}`, name)
 personIterator, err := ctx.GetStub().GetQueryResult(personQueryAsBytes)
 if err != nil {
  return
 }

 defer personIterator.Close()

 var person Person
 for personIterator.HasNext() {
  personAsKeyValue, iteratorErr := personIterator.Next()

  if iteratorErr != nil {
   return result, iteratorErr
  }

  if strings.Contains(personAsKeyValue.Key, "V1") {
   var personv1 PersonV1
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age

  } else if strings.Contains(personAsKeyValue.Key, "V2") {
   var personv2 PersonV2
   if err = json.Unmarshal(personAsKeyValue.Value, &personv2); err != nil {
    return
   }

   person.Name = personv2.Name
   person.Age = personv2.Age
   person.Surname = personv2.Surname
  } else {
   var personv1 PersonV1
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age
  }
 }

 result = person

 return
}

// UpdatePersonToLatestVersion updates a person to the latest chaincode data model
func (c *TsContract) UpdatePersonToLatestVersion(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {
 personQueryAsBytes := fmt.Sprintf(`{"selector":{"name":"%s"}}`, name)
 personIterator, err := ctx.GetStub().GetQueryResult(personQueryAsBytes)
 if err != nil {
  return
 }

 defer personIterator.Close()

 var person Person
 for personIterator.HasNext() {
  personAsKeyValue, iteratorErr := personIterator.Next()

  if iteratorErr != nil {
   return result, iteratorErr
  }

  if strings.Contains(personAsKeyValue.Key, "V1") {
   var personv1 PersonV1
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age
   person.Surname = "unknown"

  } else if strings.Contains(personAsKeyValue.Key, "V2") {
   var personv2 PersonV2
   if err = json.Unmarshal(personAsKeyValue.Value, &personv2); err != nil {
    return
   }

   person.Name = personv2.Name
   person.Age = personv2.Age
   person.Surname = personv2.Surname
  } else {
   var personv1 PersonV1
   if err = json.Unmarshal(personAsKeyValue.Value, &personv1); err != nil {
    return
   }

   person.Name = personv1.Name
   person.Age = personv1.Age
  }
 }

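 serializedPerson, encodeError := json.Marshal(person)
 if encodeError != nil {
  // Nothing valid to write, so stop before touching the ledger
  return result, encodeError
 }

 // The migrated record follows the V2 model, so it is stored under the V2 key;
 // under a V1 key, LoadPerson would decode it as PersonV1 and drop the surname
 key, err := c.CreateVersionedCompositeKey(ctx, name, "V2")
 if err != nil {
  return
 }

 if internalError := ctx.GetStub().PutState(key, serializedPerson); internalError != nil {
  // Report the PutState error itself; err is nil at this point
  err = errors.New("Failed to store data: " + internalError.Error())
 }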

 result = person

 return
}

We can still retrieve “jill” with this code:

We can also create “frank smith” aged 22 with this code:

And retrieve “frank”:

We can also update “jill” to the newest version:

Flaws of this data migration design

There are a couple of serious downsides to this design, so we are not even going to implement a V3 (although the V1 vs V2 example makes it clear how that would work). These are the downsides (there are probably others, as the design is quite naive):

We have lost the ability to load a record history using the out-of-the-box functionality of HLF.
We need to deal with a record that will have multiple composite keys (e.g.: a person named Peter will be stored as V1#Peter , V2#Peter , etc.) and this will be a nightmare in the long run.
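To illustrate the second downside: once the same person exists under several versioned keys, every read has to decide which copy is current. A minimal sketch of that decision follows, assuming version labels of the form “V” plus a number as used in this article; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// versionNumber turns a label such as "V2" into 2.
func versionNumber(label string) (uint64, error) {
	if !strings.HasPrefix(label, "V") {
		return 0, fmt.Errorf("version %q does not start with V", label)
	}
	return strconv.ParseUint(label[1:], 10, 32)
}

// latestVersion returns the highest label among the versions a person
// is stored under. Labels are compared numerically, so "V10" beats "V9".
func latestVersion(labels []string) (string, error) {
	if len(labels) == 0 {
		return "", fmt.Errorf("no versions given")
	}
	best := ""
	var bestN uint64
	for i, label := range labels {
		n, err := versionNumber(label)
		if err != nil {
			return "", err
		}
		if i == 0 || n > bestN {
			best, bestN = label, n
		}
	}
	return best, nil
}

func main() {
	latest, err := latestVersion([]string{"V1", "V10", "V9"})
	fmt.Println(latest, err)
}
```

Every function that reads a person would need this extra step, which is part of why the design becomes hard to maintain.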

It’s clear that the design has some “upsides” but the downsides just make it a no-go (at least for me). But still, we can give a little twist to the design to make it more robust and tackle the downsides.

A proper data migration strategy

Ok, we were close with our initial data migration strategy, but we saw it has some fundamental flaws that make it unusable. Let’s give things a twist.

What if we make our store/load strategy able to deal with “any” person data structure? How can we do this? Let’s imagine that our person data model changes a bit. We won’t use the person struct to structure the stored data itself. Instead, we will create a structure flexible enough to handle any situation: one field stores the version-specific data, and the other fields store only metadata that tells us which version of the data model we are dealing with.
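Before looking at the chaincode, the idea can be sketched in plain Go. This is a simplified variation and not the article’s implementation: the Envelope type, the use of json.RawMessage for the payload, the legacyPerson helper and the choice to read legacy ages as years are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is what gets written under the person's key: metadata plus
// the payload exactly as the given model version serialized it.
type Envelope struct {
	Version string          `json:"version"`
	Data    json.RawMessage `json:"data"`
}

// Person is the current (v3) model handed back to callers.
type Person struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     string `json:"age"`
}

// legacyPerson covers v1 and v2, where age was a number.
type legacyPerson struct {
	Name    string `json:"name"`
	Surname string `json:"surname"`
	Age     uint   `json:"age"`
}

// decodePerson reads the version first and only then picks the struct
// to unmarshal the payload into.
func decodePerson(raw []byte) (Person, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return Person{}, fmt.Errorf("decode envelope: %w", err)
	}
	switch env.Version {
	case "V1", "V2":
		var old legacyPerson
		if err := json.Unmarshal(env.Data, &old); err != nil {
			return Person{}, fmt.Errorf("decode %s payload: %w", env.Version, err)
		}
		// Assumption of this sketch: legacy ages were expressed in years.
		return Person{Name: old.Name, Surname: old.Surname, Age: fmt.Sprintf("%d y", old.Age)}, nil
	case "V3":
		var p Person
		err := json.Unmarshal(env.Data, &p)
		return p, err
	default:
		return Person{}, fmt.Errorf("unknown person version %q", env.Version)
	}
}

func main() {
	v1 := []byte(`{"version":"V1","data":{"name":"peter","age":21}}`)
	v3 := []byte(`{"version":"V3","data":{"name":"ann","surname":"Smith","age":"420 m"}}`)
	for _, raw := range [][]byte{v1, v3} {
		p, err := decodePerson(raw)
		fmt.Printf("%+v %v\n", p, err)
	}
}
```

Because the version travels inside the stored value rather than inside the key, a person keeps a single key across model versions, and each historical value of that key still states which model it was written with.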

Implementing our proper data migration

Let’s take a look at a possible implementation (for simplicity, we have merged all versions up to V3 in the same code… just for educational purposes and to convey the idea behind this data modeling strategy).

//V1, V2 and V3 (all in one)

package public

import (
 "encoding/json"
 "errors"
 "fmt"
 "test-chaincode/capabilities/models"

 "github.com/hyperledger/fabric-contract-api-go/contractapi"
 "github.com/hyperledger/fabric/common/flogging"
)

// TsContract provides the public test contract implementation
type TsContract struct{}

// Ping returns {ping: 'pong'}
func (c *TsContract) Ping() string {
 return "{\"ping\": \"pong\"}"
}

type PersonContainer struct {
 Metadata       PersonMetadata       `json:"metadata" validate:"required"`
 SerializedData PersonSerializedData `json:"serialized_data" validate:"required"`
}

type PersonMetadata struct {
 Version string `json:"version" validate:"required"`
}

type PersonSerializedData struct {
 Data []byte `json:"data" validate:"required"`
}

type Person struct {
 Name    string `json:"name" validate:"required"`
 Age     string `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

type PersonV1 struct {
 Name string `json:"name" validate:"required"`
 Age  uint   `json:"age" validate:"required"`
}

type PersonV2 struct {
 Name    string `json:"name" validate:"required"`
 Age     uint   `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

type PersonV3 struct {
 Name    string `json:"name" validate:"required"`
 Age     string `json:"age" validate:"required"`
 Surname string `json:"surname"`
}

var logger = flogging.MustGetLogger("test_contract")

// CreatePersonCompositeKey creates a composite key that identifies a person
func (c *TsContract) CreatePersonCompositeKey(
 ctx contractapi.TransactionContextInterface,
 name string,
) (compositekey string, err error) {
 attributes := []string{name}
 compositekey, err = ctx.GetStub().CreateCompositeKey("Person", attributes)
 if err != nil {
  return
 }

 return
}

// StorePersonV1 stores a person in the ledger in version 1
func (c *TsContract) StorePersonV1(
 ctx contractapi.TransactionContextInterface,
 name string,
 age uint,
) (person PersonV1, err error) {

 person.Age = age
 person.Name = name

 serializedPerson, encodeError := json.Marshal(person)
 if encodeError != nil {
  err = encodeError
  return
 }

 var personContainer PersonContainer
 personContainer.Metadata.Version = "V1"
 personContainer.SerializedData.Data = serializedPerson

 serializedPersonContainer, encodeError := json.Marshal(personContainer)
 if encodeError != nil {
  err = encodeError
  return
 }

 key, err := c.CreatePersonCompositeKey(ctx, name)
 if err != nil {
  return
 }

 // Report the PutState error itself; err is nil at this point
 if internalError := ctx.GetStub().PutState(key, serializedPersonContainer); internalError != nil {
  err = errors.New("Failed to store data: " + internalError.Error())
 }

 return
}

// StorePersonV2 stores a person in the ledger in version 2
func (c *TsContract) StorePersonV2(
 ctx contractapi.TransactionContextInterface,
 name string,
 age uint,
 surname string,
) (person PersonV2, err error) {

 person.Name = name
 person.Age = age
 person.Surname = surname

 serializedPerson, encodeError := json.Marshal(person)
 if encodeError != nil {
  err = encodeError
  return
 }

 var personContainer PersonContainer
 personContainer.Metadata.Version = "V2"
 personContainer.SerializedData.Data = serializedPerson

 serializedPersonContainer, encodeError := json.Marshal(personContainer)
 if encodeError != nil {
  err = encodeError
  return
 }

 key, err := c.CreatePersonCompositeKey(ctx, name)
 if err != nil {
  return
 }

 // Report the PutState error itself; err is nil at this point
 if internalError := ctx.GetStub().PutState(key, serializedPersonContainer); internalError != nil {
  err = errors.New("Failed to store data: " + internalError.Error())
 }

 return
}

// StorePersonV3 stores a person in the ledger in version 3
func (c *TsContract) StorePersonV3(
 ctx contractapi.TransactionContextInterface,
 name string,
 age string,
 surname string,
) (person PersonV3, err error) {
 person.Name = name
 person.Age = age
 person.Surname = surname

 serializedPerson, encodeError := json.Marshal(person)
 if encodeError != nil {
  err = encodeError
  return
 }

 var personContainer PersonContainer
 personContainer.Metadata.Version = "V3"
 personContainer.SerializedData.Data = serializedPerson

 serializedPersonContainer, encodeError := json.Marshal(personContainer)
 if encodeError != nil {
  err = encodeError
  return
 }

 key, err := c.CreatePersonCompositeKey(ctx, name)
 if err != nil {
  return
 }

 // Report the PutState error itself; err is nil at this point
 if internalError := ctx.GetStub().PutState(key, serializedPersonContainer); internalError != nil {
  err = errors.New("Failed to store data: " + internalError.Error())
 }

 return
}

// LoadPerson loads a person from the ledger
func (c *TsContract) LoadPerson(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result Person, err error) {

 key, err := c.CreatePersonCompositeKey(ctx, name)
 if err != nil {
  return
 }

 personContainerBytes, err := ctx.GetStub().GetState(key)
 if err != nil {
  return
 }

 if personContainerBytes == nil {
  err = fmt.Errorf("failed to retrieve the person %s from the world state", name)
  return
 }

 var personContainer PersonContainer
 err = json.Unmarshal(personContainerBytes, &personContainer)
 if err != nil {
  return
 }

 switch personVersion := personContainer.Metadata.Version; personVersion {
 case "V1":
  var personv1 PersonV1
  err = json.Unmarshal(personContainer.SerializedData.Data, &personv1)
  if err != nil {
   return
  }
  result.Age = fmt.Sprintf("%d y", personv1.Age)
  result.Name = personv1.Name
  result.Surname = "unknown"
 case "V2":
  var personv2 PersonV2
  err = json.Unmarshal(personContainer.SerializedData.Data, &personv2)
  if err != nil {
   return
  }
  result.Age = fmt.Sprintf("%d y", personv2.Age)
  result.Name = personv2.Name
  result.Surname = personv2.Surname
 case "V3":
  var personv3 PersonV3
  err = json.Unmarshal(personContainer.SerializedData.Data, &personv3)
  if err != nil {
   return
  }
  result.Age = personv3.Age
  result.Name = personv3.Name
  result.Surname = personv3.Surname
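 default:
  // Fail loudly instead of returning an empty person for an unknown version
  err = fmt.Errorf("unsupported person version %q", personVersion)
 }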

 return
}

// PersonHistory loads a person history from the ledger
func (c *TsContract) PersonHistory(
 ctx contractapi.TransactionContextInterface,
 name string,
) (result []Person, err error) {
 key, err := c.CreatePersonCompositeKey(ctx, name)
 if err != nil {
  return
 }

 personHistoryIterator, err := ctx.GetStub().GetHistoryForKey(key)
 if err != nil {
  return
 }

 defer personHistoryIterator.Close()

 for personHistoryIterator.HasNext() {
  personAsKeyValue, iteratorErr := personHistoryIterator.Next()

  // Check the error returned by Next, not the outer err
  if iteratorErr != nil {
   return result, iteratorErr
  }

  if personAsKeyValue.IsDelete {
   continue
  }

  var personContainer PersonContainer
  err = json.Unmarshal(personAsKeyValue.Value, &personContainer)
  if err != nil {
   return
  }

  var person Person
  switch personVersion := personContainer.Metadata.Version; personVersion {
  case "V1":
   var personv1 PersonV1
   err = json.Unmarshal(personContainer.SerializedData.Data, &personv1)
   if err != nil {
    return
   }
   person.Age = fmt.Sprintf("%d", personv1.Age)
   person.Name = personv1.Name
   person.Surname = ""
  case "V2":
   var personv2 PersonV2
   err = json.Unmarshal(personContainer.SerializedData.Data, &personv2)
   if err != nil {
    return
   }
   person.Age = fmt.Sprintf("%d", personv2.Age)
   person.Name = personv2.Name
   person.Surname = personv2.Surname
  case "V3":
   var personv3 PersonV3
   err = json.Unmarshal(personContainer.SerializedData.Data, &personv3)
   if err != nil {
    return
   }
   person.Age = personv3.Age
   person.Name = personv3.Name
   person.Surname = personv3.Surname
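  default:
   // Fail loudly instead of appending an empty person for an unknown version
   return result, fmt.Errorf("unsupported person version %q", personVersion)
  }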

  result = append(result, person)
 }
 return
}

Some highlights regarding the code above:

The PersonHistory function is back in the game.
LoadPerson does not need an iterator anymore.
CreatePersonCompositeKey doesn’t embed the version inside the record key.
As already mentioned, we have split the StorePerson function into 3 different functions to simulate storing the 3 versions of our data model, for educational reasons. A real implementation wouldn’t have these 3 functions, just one that stores the person in the latest version, as in the other examples earlier in the article.
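The version switch appears twice in the listing (in LoadPerson and in PersonHistory), and the two copies already format the V1 and V2 age differently. One way to keep them in sync is to move the decoding into a single helper. The following is a minimal sketch that runs outside of Fabric using only the standard library; decodePerson and wrapPerson are hypothetical names that are not part of the original code, and the choice of "%d" for the age format is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The container and versioned types mirror the listing above (validate tags omitted).
type PersonContainer struct {
	Metadata       PersonMetadata       `json:"metadata"`
	SerializedData PersonSerializedData `json:"serialized_data"`
}

type PersonMetadata struct {
	Version string `json:"version"`
}

type PersonSerializedData struct {
	Data []byte `json:"data"`
}

type Person struct {
	Name    string `json:"name"`
	Age     string `json:"age"`
	Surname string `json:"surname"`
}

type PersonV1 struct {
	Name string `json:"name"`
	Age  uint   `json:"age"`
}

type PersonV2 struct {
	Name    string `json:"name"`
	Age     uint   `json:"age"`
	Surname string `json:"surname"`
}

// decodePerson (hypothetical helper) reads the version from the container
// metadata and converts the payload to the latest Person model.
func decodePerson(containerBytes []byte) (Person, error) {
	var container PersonContainer
	if err := json.Unmarshal(containerBytes, &container); err != nil {
		return Person{}, err
	}
	data := container.SerializedData.Data

	switch version := container.Metadata.Version; version {
	case "V1":
		var v1 PersonV1
		if err := json.Unmarshal(data, &v1); err != nil {
			return Person{}, err
		}
		return Person{Name: v1.Name, Age: fmt.Sprintf("%d", v1.Age)}, nil
	case "V2":
		var v2 PersonV2
		if err := json.Unmarshal(data, &v2); err != nil {
			return Person{}, err
		}
		return Person{Name: v2.Name, Age: fmt.Sprintf("%d", v2.Age), Surname: v2.Surname}, nil
	case "V3":
		// V3 already has the shape of the latest model.
		var v3 Person
		err := json.Unmarshal(data, &v3)
		return v3, err
	default:
		return Person{}, fmt.Errorf("unsupported person version %q", version)
	}
}

// wrapPerson (hypothetical helper) builds the bytes a store function would
// pass to PutState: the payload plus the version metadata.
func wrapPerson(version string, payload interface{}) ([]byte, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(PersonContainer{
		Metadata:       PersonMetadata{Version: version},
		SerializedData: PersonSerializedData{Data: data},
	})
}

func main() {
	// Illustrative data: a V1 record read back as the latest model.
	stored, err := wrapPerson("V1", PersonV1{Name: "luigi", Age: 30})
	if err != nil {
		panic(err)
	}
	person, err := decodePerson(stored)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", person)
}
```

With a helper like this, LoadPerson would call it on the bytes returned by GetState and PersonHistory would call it on each history entry, so a future V4 only needs one new case.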

Now let’s see if this works.

Ok, we are definitely able to store persons in V1, V2 and V3.

Now let’s try to retrieve them.

Ok, this is getting better by the minute. We are able to retrieve persons in any version.

Let’s now retrieve their history (we will do it just with “luigi” to save some post space). First we update Luigi’s data and we take the chance to update him to V3.

Now let’s retrieve Luigi’s history.

Success!!

This design is definitely better than the previous two. These are its main benefits compared with the other designs:

This data model design is far more flexible than the previous two and will cope much better with new business requirements that involve data changes.
We are storing data using a “predictable” and non-complex key (we are using the person’s name as the key of the record, and that information is something we already have without having to query the blockchain to find the version).
We have recovered the PersonHistory function and we are leveraging the full power and potential of Hyperledger Fabric using a single call to the GetHistoryForKey function.

A brief “disclaimer” on the PersonHistory implementation: to keep the function simple, all objects in the returned array are “converted” to the latest version of the model. The function could instead have returned each object in its own version, but as said, I have avoided this to keep it simpler and quicker to implement.
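For the alternative mentioned in the disclaimer, a history function could return each entry together with its own version instead of converting it. The following is a minimal sketch that runs outside of Fabric; VersionedPerson, toVersionedPerson and wrap are hypothetical names, the sample records are illustrative, and whether the contract API can return a json.RawMessage field directly is something I have not checked.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// storedContainer mirrors the JSON layout of PersonContainer in the listing above.
type storedContainer struct {
	Metadata struct {
		Version string `json:"version"`
	} `json:"metadata"`
	SerializedData struct {
		Data []byte `json:"data"`
	} `json:"serialized_data"`
}

// VersionedPerson (hypothetical type) keeps the payload exactly as it was
// stored, next to the model version it was written with.
type VersionedPerson struct {
	Version string          `json:"version"`
	Person  json.RawMessage `json:"person"`
}

// toVersionedPerson unwraps one history entry without converting it.
func toVersionedPerson(containerBytes []byte) (VersionedPerson, error) {
	var c storedContainer
	if err := json.Unmarshal(containerBytes, &c); err != nil {
		return VersionedPerson{}, err
	}
	if !json.Valid(c.SerializedData.Data) {
		return VersionedPerson{}, fmt.Errorf("payload of version %q is not valid JSON", c.Metadata.Version)
	}
	return VersionedPerson{
		Version: c.Metadata.Version,
		Person:  json.RawMessage(c.SerializedData.Data),
	}, nil
}

// wrap (hypothetical helper) builds container bytes for the demonstration.
func wrap(version string, payload []byte) []byte {
	var c storedContainer
	c.Metadata.Version = version
	c.SerializedData.Data = payload
	out, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return out
}

func main() {
	// Stand-in for the values an iterator over the key history would yield.
	history := [][]byte{
		wrap("V1", []byte(`{"name":"luigi","age":30}`)),
		wrap("V3", []byte(`{"name":"luigi","age":"31","surname":"unknown"}`)),
	}

	var result []VersionedPerson
	for _, raw := range history {
		entry, err := toVersionedPerson(raw)
		if err != nil {
			panic(err)
		}
		result = append(result, entry)
	}

	out, err := json.Marshal(result)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The caller then receives the raw V1, V2 or V3 payloads and decides how to interpret each one, at the cost of having to understand every version itself.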

Trade-offs regarding the implementation

There are not many trade-offs with this implementation. It’s flexible, powerful and extendable, and it will be able to manage nearly any situation (obviously this code has been created quickly and without much polish, so bear with me if there is any code styling you disagree with… what I want you to take away is the strategy, not the code itself 😃).

The main trade-off is the complexity. Even though the design is flexible and powerful, it’s not as straightforward as directly storing the person instance in the ledger. I would recommend this approach especially for applications where business rules are immature and might change quickly (but as always, think about whether you really need this or whether you can get away with another strategy).

Other alternatives

Although data migration strategies are a good thing, they come with a big downside: they make our chaincode more complex to implement in the first place, and they can make the maintenance and evolution of the chaincode quite costly.

Here are a couple of ideas of other things we could do to deal with data model updates:

Always extend your data model in a way that keeps it backward compatible with the data stored by previous versions. With this idea in mind, we could have implemented the version iterations of our person model like so:

//V1
type Person struct {
    Name        string   `json:"name"`
    Age         uint     `json:"age"`
}

//V2
type Person struct {
    Name        string   `json:"name"`
    Surname     string   `json:"surname"`
    Age         uint     `json:"age"`
}

//V3
type Person struct {
    Name        string   `json:"name"`
    Surname     string   `json:"surname"`
    Age         uint     `json:"age"`
    AgeAsString string   `json:"ageAsString"`
}

Be aware that data models can get messy if implemented this way, but for chaincodes with simple business logic where version iterations are not expected, this might be enough (everything depends on how complex we need our chaincode to be, given the product we are working on and our capability to predict the future).
This idea is more of a theory than a proven fact, and is quite possibly based on an incorrect assumption, but it’s worth mentioning: we could potentially use the cross-chaincode invoke capabilities of HLF to invoke the Query System Chaincode or QSCC (which is installed by default in all HLF peers). This approach remains untested (probably to be done as part of another article), but we could potentially retrieve the chaincode version of a given transaction before unmarshalling the value to an object in our chaincode. The idea would again be to iterate over the history of a key with GetHistoryForKey: each entry it returns carries a TxId attribute (the entries of a StateQueryIteratorInterface do not), and the QSCC could be invoked with that TxId to retrieve further data about the transaction, in the hope that it includes the specific chaincode version that was used to write it. Sadly, the QSCC documentation that HLF provides is scarce (to say the least), so this investigation will remain open until we have the time to test it. There are also a couple of restrictions that could prevent us from using this approach (e.g.: if the channel is configured to restrict the usage of QSCC using channel ACLs), so it is not a universal approach and hence we have not tried to implement it.
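Going back to the first idea, the backward-compatible model works because Go’s encoding/json leaves fields that are missing from the stored JSON at their zero value. The following is a minimal sketch using the V3 struct from that example; loadCompatible is a hypothetical helper, deriving AgeAsString from Age is an assumption about how the new field should be filled, and the record for “jill” is illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Person is the V3 model from the backward-compatible example above.
type Person struct {
	Name        string `json:"name"`
	Surname     string `json:"surname"`
	Age         uint   `json:"age"`
	AgeAsString string `json:"ageAsString"`
}

// loadCompatible (hypothetical helper) decodes a record written by any
// version of the model and fills in the fields older versions did not have.
func loadCompatible(stored []byte) (Person, error) {
	var p Person
	if err := json.Unmarshal(stored, &p); err != nil {
		return Person{}, err
	}
	// Records written before V3 have no ageAsString; derive it from age.
	if p.AgeAsString == "" {
		p.AgeAsString = strconv.FormatUint(uint64(p.Age), 10)
	}
	return p, nil
}

func main() {
	records := [][]byte{
		[]byte(`{"name":"jill","age":30}`),                    // written as V1
		[]byte(`{"name":"frank","surname":"smith","age":22}`), // written as V2
	}
	for _, record := range records {
		p, err := loadCompatible(record)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%+v\n", p)
	}
}
```

No version metadata is needed here, but the struct keeps every field that was ever added, which is where the messiness mentioned above comes from.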

Final conclusions

Let’s recap the main topics we have touched upon in this article and close with some final thoughts:

There is no one-size-fits-all solution when dealing with data models in chaincode contracts. Each business case is different, so try to agree with the business on a data model that remains unaltered, or at least make sure the first versions include a model that will withstand the passage of time and future iterations.
Data models in chaincodes can be a complicated topic, especially in situations where the business requirements change and grow fast, so it’s better to design your data models with care before jumping into implementing your contracts.
If you are dealing with a product where the data model is predicted to change and grow, spend time thinking about your data modeling/migration strategy. Try to come up with a generic enough approach for your model and let it be flexible and expandable taking into consideration the features of a blockchain (immutability). Take some time and think how you will want to deal with data migrations in the future.
If possible, version your data models and design your chaincode data model with extreme care and love… always plan ahead and think of a way to make your life easier when dealing with future iterations of the model.
Do not despair, there will always be a way (with more or less effort) to deal with your data migrations in the future. But (just in case I have not repeated this enough times) THINK AND DESIGN before you jump into your implementation.