The semantics of Marshalling in Go

March 26, 2021

In this post, I’d like to discuss how the Marshalling/Unmarshalling process is usually applied in Go programs. I’ll show how it is used in practice with some code examples (e.g. data parsing) and then expand on other uses, such as transforming data across different domains.

Before discussing the term Marshalling, I’d like to start with the word data in the context of programming languages. More often than not, we programmers have a task that involves, in some shape or form, taking data represented in one format and transforming it into a different representation.

A few use-cases that come to mind:


Fetch data from this HTTP API and convert it into my own type to then do something…


As soon as this “data” arrives, transform it into something else, which is what our system understands…


In order to integrate with a certain third-party API, I need to convert this “data” into something that the third-party API understands and then send it…

What should we call this?

When it comes to changing data from one representation into another in software, I would imagine some of these terms would ring a bell: Encode/Decode, Marshal/Unmarshal, Map, Normalise (also Normalize), Parse, Reduce, Serialise (also Serialize), Transform, etc.

You might notice that different programming languages favour some of these terms over others and, in Go, the terms Marshal and Unmarshal are usually preferred. In this post, I’d like to explore, or try to guess, why this is the case and what all of these terms have in common.

The term Marshalling

In the context of computer science and programming languages, marshalling is the process of transforming a data representation into a format suitable for use by a different software component.

This process is quite interesting because it enables interoperability between different pieces of software. Consider Software X and Software Y, which are totally independent of one another. They can still communicate with each other via message passing and exchange data, as long as they establish a common contract or interface.

For example:

Software X –> Marshal data "abc" into "<abc>"
Software Y –> Unmarshal data "<abc>" into "A, B, C"

How Marshal/Unmarshal can be used in Go

In Go, one of the best-known examples is converting data from a JSON document into Go values (in other words, JSON “parsing”).
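To make that concrete, here is a minimal sketch using the standard encoding/json package. The user type, the decodeUser helper and the payload are made up for illustration; only json.Unmarshal and json.Marshal come from the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// user is a hypothetical Go type that mirrors the JSON payload used below.
type user struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

// decodeUser unmarshals a JSON document into a user value.
func decodeUser(payload []byte) (user, error) {
	var u user
	if err := json.Unmarshal(payload, &u); err != nil {
		return user{}, fmt.Errorf("decode user: %w", err)
	}
	return u, nil
}

func main() {
	u, err := decodeUser([]byte(`{"name": "John", "type": "Premium"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", u) // prints {Name:John Type:Premium}

	// Marshal goes the other way: from the Go value back to JSON bytes.
	out, err := json.Marshal(u)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out)) // prints {"name":"John","type":"Premium"}
}
```

Note that json.Unmarshal needs a pointer to the destination, otherwise it has nothing it can write the decoded fields into.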

What about the term Parsing?

Parsing is a quite popular term that can be seen as similar, but don’t assume that just yet. I would say that the semantics of parsing apply best when the data to be processed are strings/symbols that will then be transformed into a custom representation in the programming language in question.

Terms like Marshalling and Encoding, on the other hand, offer a bit more scope to work with. If you need to transform type A into type B and type A is binary data, I personally think that calling the transformation a “parser” doesn’t feel right.
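As an illustration of the binary case, here is a small sketch built on the standard encoding/binary package. The header type and its 6-byte layout are invented for this example; the method names follow the encoding.BinaryMarshaler and encoding.BinaryUnmarshaler interfaces from the standard library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// header is a hypothetical fixed-size record: a 2-byte version followed by a
// 4-byte ID, both big-endian. The layout is invented for this example.
type header struct {
	Version uint16
	ID      uint32
}

const headerSize = 6

// MarshalBinary encodes h into its 6-byte wire form.
func (h *header) MarshalBinary() ([]byte, error) {
	buf := make([]byte, headerSize)
	binary.BigEndian.PutUint16(buf[0:2], h.Version)
	binary.BigEndian.PutUint32(buf[2:6], h.ID)
	return buf, nil
}

// UnmarshalBinary decodes the 6-byte wire form into h.
func (h *header) UnmarshalBinary(data []byte) error {
	if len(data) != headerSize {
		return errors.New("header: unexpected length")
	}
	h.Version = binary.BigEndian.Uint16(data[0:2])
	h.ID = binary.BigEndian.Uint32(data[2:6])
	return nil
}

func main() {
	in := header{Version: 1, ID: 42}
	raw, err := in.MarshalBinary()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", raw) // prints 00 01 00 00 00 2a

	var out header
	if err := out.UnmarshalBinary(raw); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", out) // prints {Version:1 ID:42}
}
```

Nothing here reads symbols or follows a grammar, which is why “unmarshal” or “decode” describes it better than “parse”.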

Back to Marshalling and Unmarshalling

It took a while, but the term “Marshalling” eventually clicked for me. When I first came to learn Go, calling a function named Unmarshal to “parse” JSON felt really weird and far too low-level.

However, Marshal/Unmarshal can and should be used as a generic way to transform types between boundaries regardless of the level of abstraction (e.g. network, application, domain, platform).
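One place where the standard library already works this way is the encoding.TextUnmarshaler interface: a type that implements UnmarshalText can be filled by both encoding/json and encoding/xml. The sketch below assumes a hypothetical Tier type and account struct; the tier names are made up.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// Tier is a hypothetical domain type for an account tier.
type Tier int

const (
	TierUnknown Tier = iota
	TierGold
	TierPremium
)

// UnmarshalText implements encoding.TextUnmarshaler, so both encoding/json
// and encoding/xml can fill a Tier from its textual form.
func (t *Tier) UnmarshalText(text []byte) error {
	switch string(text) {
	case "Gold":
		*t = TierGold
	case "Premium":
		*t = TierPremium
	default:
		return fmt.Errorf("unknown tier %q", text)
	}
	return nil
}

// account is a hypothetical type that can be decoded from JSON or XML.
type account struct {
	Name string `json:"name" xml:"Name"`
	Tier Tier   `json:"type" xml:"Type"`
}

func fromJSON(b []byte) (account, error) {
	var a account
	err := json.Unmarshal(b, &a)
	return a, err
}

func fromXML(b []byte) (account, error) {
	var a account
	err := xml.Unmarshal(b, &a)
	return a, err
}

func main() {
	j, err := fromJSON([]byte(`{"name": "John", "type": "Premium"}`))
	fmt.Println(j.Name, j.Tier == TierPremium, err) // prints John true <nil>

	x, err := fromXML([]byte(`<User><Name>Mary</Name><Type>Gold</Type></User>`))
	fmt.Println(x.Name, x.Tier == TierGold, err) // prints Mary true <nil>
}
```

The domain rule (which tier names are valid) lives in one method, and the wire format becomes a detail of whoever calls it.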

Converting from one format to another

When dealing with distributed systems, converting representations from one domain to another is quite common. And I’m not only referring to business domains (e.g. shopping cart, billing, invoice) but also to different transport layers (e.g. network, protocols).

Data format transformations (e.g. JSON, XML)

We can unmarshal from type A to type B when transporting data over the wire or when converting from a generic data representation to a domain-specific representation.

So let’s imagine we are building a system that manages users’ accounts and we have the following use-cases to cover:

Code examples

Unit tests

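package account

import (
	"fmt"
	"testing"
)

// Test_Unmarshal checks that an Account can be built from each supported
// source representation, and that unsupported ones are rejected.
func Test_Unmarshal(t *testing.T) {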
	tests := []struct {
		name string
		in   interface{}
		out  Account
	}{
		{
			name: "should unmarshal Account from XML",
			in: []byte(`
			<?xml version="1.0" encoding="UTF-8"?>
			<User>
				<Name>Mary</Name>
				<Type>Gold</Type>
			</User>
			`),
			out: Account{
				Name: "Mary",
			},
		},

		{
			name: "should unmarshal Account from JSON",
			in: []byte(`
			{
				"user": {
					"name": "John",
					"type": "Premium",
					"created_at": "2020-01-01"
				}
			}
			`),
			out: Account{
				Name: "John",
			},
		},

		{
			name: "should unmarshal Account from protobuf",
			in: ProtobufAccount{
				AccountName: "Bob",
			},
			out: Account{
				Name: "Bob",
			},
		},

		{
			name: "should not unmarshal from CSV format since it isn't supported yet",
			in:   []byte(`Name,Mary`),
			out:  Account{},
		},

		// ...
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			var a Account
			err := a.Unmarshal(tt.in)
			if err != nil {
				expectedError := fmt.Errorf("unable to unmarshal %v, format not supported.", tt.in)

				if err.Error() != expectedError.Error() {
					t.Errorf("unexpected error while trying to unmarshal account: %v", err)
				}
			}

			if tt.out != a {
				t.Errorf("test failed. wanted: %v, got: %v", tt.out, a)
			}
		})
	}
}

Implementation

package account

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"strings"
)

// xmlPayload represents the XML payload with user data. The root element
// (<User>) maps to the struct itself, so fields are tagged with the names of
// its child elements.
type xmlPayload struct {
	XMLName xml.Name `xml:"User"`
	Name    string   `xml:"Name"`
}

// jsonPayload represents the JSON payload with user data
type jsonPayload struct {
	User struct {
		Name string `json:"name"`
	} `json:"user"`
}

// ProtobufAccount represents a generated Go code from a protobuf definition
// More details: https://developers.google.com/protocol-buffers/docs/reference/go-generated
type ProtobufAccount struct {
	AccountName string `protobuf:"bytes,1,opt,name=account_name,proto3"`
}

// Account is the domain representation of an account in the system. It is the
// main entity for the example we're using.
type Account struct {
	Name string
}

func (a *Account) unmarshalXML(src []byte) error {
	var res xmlPayload
	if err := xml.Unmarshal(src, &res); err != nil {
		return err
	}
	// Write through the pointer so the caller's Account is updated.
	// Reassigning a itself would only change the local copy of the pointer.
	*a = Account{
		Name: res.Name,
	}
	return nil
}

func (a *Account) unmarshalJSON(src []byte) error {
	var res jsonPayload
	if err := json.Unmarshal(src, &res); err != nil {
		return err
	}
	*a = Account{
		Name: res.User.Name,
	}
	return nil
}

func (a *Account) unmarshalProtobuf(src ProtobufAccount) error {
	*a = Account{
		Name: src.AccountName,
	}

	return nil
}

// Unmarshal takes an interface{} and tries to convert it into the Account type
func (a *Account) Unmarshal(src interface{}) error {
	switch v := src.(type) {
	case []byte:
		// naive logic to determine if it is XML
		if strings.Contains(string(v), "<?xml") {
			return a.unmarshalXML(v)
		}

		// naive logic to determine if it is JSON
		if strings.Contains(string(v), `"user":`) {
			return a.unmarshalJSON(v)
		}
	case ProtobufAccount:
		return a.unmarshalProtobuf(v)
	}

	return fmt.Errorf("unable to unmarshal %v, format not supported.", src)
}

Converting from one domain to another

In this particular case, let’s imagine we have an e-commerce system that goes from adding an item to a cart, to placing an order, to sending an invoice to the customer.

Once the customer selects an item and adds it to her/his cart, the item will be part of an order that will also be listed in the invoice afterwards. However, the “item” might not have the same meaning across the system.

I would suggest reading about Bounded Context for more details. It is described in more depth in the book Domain-Driven Design.

Without further ado, let’s check how Marshal and Unmarshal can be used to translate different data representations across different domains:
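A minimal sketch of that idea follows. The CartItem, OrderLine and InvoiceLine types, their fields and the checkout helper are all hypothetical names chosen for illustration; they only follow the same Unmarshal(src interface{}) convention used for Account earlier.

```go
package main

import "fmt"

// CartItem is how a hypothetical cart domain sees an item.
type CartItem struct {
	SKU            string
	Quantity       int
	UnitPriceCents int
}

// OrderLine is how a hypothetical order domain sees the same item.
type OrderLine struct {
	SKU        string
	Quantity   int
	TotalCents int
}

// InvoiceLine is how a hypothetical invoicing domain presents it.
type InvoiceLine struct {
	Description string
	AmountCents int
}

// Unmarshal translates a representation from another domain into an OrderLine.
func (o *OrderLine) Unmarshal(src interface{}) error {
	switch v := src.(type) {
	case CartItem:
		*o = OrderLine{
			SKU:        v.SKU,
			Quantity:   v.Quantity,
			TotalCents: v.Quantity * v.UnitPriceCents,
		}
		return nil
	default:
		return fmt.Errorf("unable to unmarshal %T into OrderLine", src)
	}
}

// Unmarshal translates a representation from another domain into an InvoiceLine.
func (l *InvoiceLine) Unmarshal(src interface{}) error {
	switch v := src.(type) {
	case OrderLine:
		*l = InvoiceLine{
			Description: fmt.Sprintf("%d x %s", v.Quantity, v.SKU),
			AmountCents: v.TotalCents,
		}
		return nil
	default:
		return fmt.Errorf("unable to unmarshal %T into InvoiceLine", src)
	}
}

// checkout is the core flow: cart item -> order line -> invoice line.
// It only relies on Unmarshal, not on how each source is represented.
func checkout(item CartItem) (InvoiceLine, error) {
	var order OrderLine
	if err := order.Unmarshal(item); err != nil {
		return InvoiceLine{}, err
	}

	var line InvoiceLine
	if err := line.Unmarshal(order); err != nil {
		return InvoiceLine{}, err
	}
	return line, nil
}

func main() {
	line, err := checkout(CartItem{SKU: "book-123", Quantity: 2, UnitPriceCents: 1500})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", line) // prints {Description:2 x book-123 AmountCents:3000}
}
```

Supporting a new source, say a JSON payload for an order, would mean adding a case to the relevant Unmarshal method while checkout stays the same.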

Because the components are now communicating via a common interface (Marshal/Unmarshal), the data format used to transfer data between components becomes merely an implementation detail. It no longer matters whether an item, when unmarshalled to an Order, arrives as a JSON payload, a protobuf message or just a data transformation between Go types. As the system grows to support different ways of transforming data (new domains being introduced, new formats, new integrations), the core flow doesn’t need to change, because Marshal and Unmarshal abstract that away.

Final thoughts

Although the Marshalling terminology might sound too low level (e.g. it feels more about computation, less about the business domain), it is a consistent way to transform data from one format to another, where this transformation can be from low-level bytes into an “object” or it can be used to translate from one domain representation to another.

One might argue that having a single interface to transform data between domains is nothing novel. However, having these semantics established is quite useful, so developers won’t need to come up with new conventions all over again. Although I appreciate having new ways to express real-life problems in the form of code, I believe that data transformations, most of the time, are stepping stones for problem-solving, not the end goal. So if there is a standard way to deal with this mundane task, I find it very welcome.

But hey, this is only my opinion at the time of this post! If you read this far, I hope I didn’t waste your time!

Thank you for reading

I hope you enjoyed this post. If you have any feedback or questions, hit me up on alabeduarte@gmail.com. I’d be happy to hear your thoughts and do better next time!

References