idiomatic codec and rpc lib for msgpack, cbor, json, etc. msgpack.org[Go]
MIT License

As of commit ebaaab4, arrays with zero values are treated as empty. Whether this is correct is up for debate (see golang/go#29310); however, it makes the current behaviour inconsistent with the standard json.Marshal(), which makes it more difficult to use ugorji as a replacement.

I personally would argue that an array of zero values is not empty. It's true that for Go arrays there is no distinction (meaning a Go array cannot be empty); however, when doing untyped unmarshalling, the difference is between an undefined/missing value and an array of zeros.
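For comparison, here is a minimal sketch of how the standard encoding/json treats a zero-valued array under omitempty. The Sample type and its field names are hypothetical and exist only for this illustration; the sketch does not exercise codec itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sample is a hypothetical type used only for this illustration.
type Sample struct {
	Arr   [2]int `json:"arr,omitempty"`   // fixed-size array holding only zeros
	Slice []int  `json:"slice,omitempty"` // nil slice
}

// marshalSample returns the encoding/json output for a zero-valued Sample.
func marshalSample() string {
	b, err := json.Marshal(Sample{})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// encoding/json only omits arrays of length zero, so the array of
	// zeros is still written while the nil slice is dropped.
	fmt.Println(marshalSample()) // {"arr":[0,0]}
}
```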

go/codec/decode.go

Lines 316 to 319 in 3cfaa07

states

	// RawToString controls how raw bytes in a stream are decoded into a nil interface{}.
	// By default, they are decoded as []byte, but can be decoded as string (if configured).
	RawToString bool

But in the old msgpack spec the raw types have op-codes 0xda for raw 16, 0xdb for raw 32, and anything starting with the binary prefix 101 for fix raw. These op-codes are equal to the corresponding string types in the new spec.

What the RawToString flag actually does is force the binary types bin8, bin16 and bin32 to be decoded as the corresponding string type.
It would be less confusing if RawToString was renamed BinToString. Changing the name at this point is probably not worth the pain, but it would be useful if the comment was updated.
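To make the op-code overlap concrete, here is a small sketch that names the family of a leading msgpack format byte, using the byte ranges from the msgpack spec. The classify helper is hypothetical and is not part of codec.

```go
package main

import "fmt"

// classify names the msgpack family of a leading format byte.
// It is a hypothetical helper written for this illustration only.
func classify(b byte) string {
	switch {
	case b&0xe0 == 0xa0: // binary prefix 101xxxxx
		return "fixstr (old spec: fix raw)"
	case b == 0xd9:
		return "str 8 (new spec only)"
	case b == 0xda:
		return "str 16 (old spec: raw 16)"
	case b == 0xdb:
		return "str 32 (old spec: raw 32)"
	case b == 0xc4, b == 0xc5, b == 0xc6:
		return "bin 8/16/32 (new spec only)"
	default:
		return "other"
	}
}

func main() {
	for _, b := range []byte{0xa5, 0xda, 0xdb, 0xc4} {
		fmt.Printf("0x%02x: %s\n", b, classify(b))
	}
}
```

Only the bin family has no counterpart in the old spec, which is why a name such as BinToString would describe the flag more closely.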

2 vulnerability issues have been found in the module. Veracode Security Scanner reports it as medium level flaw.
Please, fix the issues.

CWE-73: External Control of File Name or Path: vendor/github.com/ugorji/go/codec/test.py:77
Details: This call to open() contains a path manipulation flaw. The argument to the function is a filename constructed using user-supplied input. If an attacker is allowed to specify all or part of the filename, it may be possible to gain unauthorized access to files on the server, including those outside the webroot, that would be normally be inaccessible to end users. The level of exposure depends on the effectiveness of input validation routines, if any. Validate all user-supplied input to ensure that it conforms to the expected format, using centralized data validation routines when possible. When using blocklists, be sure that the sanitizing routine performs a sufficient number of iterations to remove all instances of disallowed characters. References: CWE OWASP
CWE-73: External Control of File Name or Path: vendor/github.com/ugorji/go/codec/test.py:80
Details: This call to open() contains a path manipulation flaw. The argument to the function is a filename constructed using user-supplied input. If an attacker is allowed to specify all or part of the filename, it may be possible to gain unauthorized access to files on the server, including those outside the webroot, that would be normally be inaccessible to end users. The level of exposure depends on the effectiveness of input validation routines, if any. Validate all user-supplied input to ensure that it conforms to the expected format, using centralized data validation routines when possible. When using blocklists, be sure that the sanitizing routine performs a sufficient number of iterations to remove all instances of disallowed characters. References: CWE OWASP
========================
FAILURE: Found 2 issues!

Hi Ugorji, little story

I had to deal with a scenario where I needed to decode a map inside a msgpack payload, remove one item, and re-encode it. But the map had nested maps, some using string and others using uint64 as the key type.

Because of this, I was forced to use map[interface{}]interface{}, but then I was unable to encode it again using the canonical sort because the code does not handle interface{} keys.

I ended up writing this code generator, which might help others:


package main

import (
	"bytes"
	"go/format"
	"io/ioutil"
	"log"
	"os"
	"strings"
	"text/template"
)

// -----------------------------------------------------------------------------

type TemplateConfig struct {
	Types  []string
}

// -----------------------------------------------------------------------------

func main() {
	funcMap := template.FuncMap{
		"capitalize": func(s string) string {
			// NOTE: no need to extract runes because always ansi characters are used
			return strings.ToUpper(s[:1]) + s[1:]
		},
	}

	// Generator template
	tmpl := template.Must(template.New("").Funcs(funcMap).Parse(`package codec

// Code generated by go generate; DO NOT EDIT.

import (
	"errors"
	"reflect"
)

// -----------------------------------------------------------------------------

// RecreateMap tries to identify the key type for each map (root or nested) and builds a new map using the proper key type.
func RecreateMap(src interface{}) (dest interface{}, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = r.(error)
		}
	}()
	dest, err = recreateMapInternal(src)
	return
}

func recreateMapInternal(src interface{}) (interface{}, error) {
	var ok bool

	if reflect.ValueOf(src).Kind() != reflect.Map {
		return src, nil
	}

{{range $type := .Types}}
	_, ok = src.(map[{{$type}}]interface{})
	if ok {
		return recreateMap{{capitalize $type}}(src.(map[{{$type}}]interface{}))
	}
{{end}}

	var srcMapAsInterface map[interface{}]interface{}
	srcMapAsInterface, ok = src.(map[interface{}]interface{})
	if !ok {
		return nil, errors.New("unsupported map type")
	}
	for key := range srcMapAsInterface {
		switch key.(type) {
{{range $type := .Types}}
		case {{$type}}:
			return recreateGenericMap{{capitalize $type}}(srcMapAsInterface)
{{end}}
		}
		break
	}

	return nil, errors.New("unsupported map type")
}

{{range $type := .Types}}
func recreateMap{{capitalize $type}}(src map[{{$type}}]interface{}) (interface{}, error) {
	var err error

	dest := make(map[{{$type}}]interface{})

	for key, value := range src {
		dest[key], err = recreateMapInternal(value)
		if err != nil {
			return nil, err
		}
	}
	return dest, nil
}

func recreateGenericMap{{capitalize $type}}(src map[interface{}]interface{}) (interface{}, error) {
	var err error

	dest := make(map[{{$type}}]interface{})

	for key, value := range src {
		keyName, ok := key.({{$type}})
		if !ok {
			return nil, errors.New("mixed key types on map")
		}

		dest[keyName], err = recreateMapInternal(value)
		if err != nil {
			return nil, err
		}
	}
	return dest, nil
}
{{end}}`))

	tmplConfig := TemplateConfig{
		Types:  []string{
			"string",
			"uint8",
			"uint16",
			"uint32",
			"uint64",
			"int8",
			"int16",
			"int32",
			"int64",
		},
	}

	output := &bytes.Buffer{}
	err := tmpl.Execute(output, tmplConfig)
	if err != nil {
		log.Fatalf("Error executing template [err=%v]", err)
	}

	var data []byte

	data, err = format.Source(output.Bytes())
	if err != nil {
		log.Fatalf("Error formatting generated code [err=%v]", err)
	}

	err = ioutil.WriteFile("generated_map_rebuilder.go", data, os.ModePerm)
	if err != nil {
		log.Fatalf("Error writing generated file [err=%v]", err)
	}
}

Kind regards,
Mauro.

As shown in the following picture:
image
Could you add a go.sum? Thanks!

File:
Error Details:
The indicated dead code may have performed some action; that action will never occur.

In github.​com/ugorji/go/codec.​checkOverflow.​SignedInt(uint64, bool): Code can never be reached because of a logical contradiction.

Screenshot:
image

Go version: 1.17, Image: us-docker.pkg.dev/google.com/api-project-999119582588/go-boringcrypto/golang:1.17.11b7

Hi, I attached a sample project in order to show the issue.

The problem appears when a package tries to use code generated in another package.

To reproduce the issue, follow these steps:

  1. Run go mod tidy and go mod vendor to download the dependencies. (vendoring is not required but used here to encapsulate the test)
  2. Run codecgen -nx -o test2_codec.go test2.go in the test2 subdirectory.
  3. Run codecgen -nx -o test_codec.go test.go in the "root" directory.
  4. Run go test

You will get something like this:

# test.com/test
.\test_codec.go:77:11: yysf5.CodecEncodeSelf undefined (type **test2.A has no field or method CodecEncodeSelf)
.\test_codec.go:101:11: yysf7.CodecEncodeSelf undefined (type **test2.A has no field or method CodecEncodeSelf)
.\test_codec.go:227:12: yysf4.CodecEncodeSelf undefined (type **test2.A has no field or method CodecEncodeSelf)
.\test_codec.go:254:12: yysf5.CodecEncodeSelf undefined (type **test2.A has no field or method CodecEncodeSelf)
FAIL    test.com/test [build failed]

I was able to patch it by replacing this line:

if ti2.flagSelfer {

with:

if ti2.flagSelfer || (isptr && ti2.flagSelferPtr) {

but I am not sure whether this is the right thing to do.

Kind regards,
Mauro.

Hi, I have to use codec to receive JSON data from clients and save it into a database in Msgpack format. Later, I restore the object to JSON and send it back to clients if they need it.

The data struct looks like this:

type Sample struct {
	KV map[string]interface{} `json:"kv" codec:"kv"`
}

Here is some simple code to demonstrate what I have to do:

package model

import (
	"fmt"
	"github.com/stretchr/testify/assert"
	"github.com/ugorji/go/codec"
	"testing"
)

type Sample struct {
	KV map[string]interface{} `json:"kv" codec:"kv"`
}

func TestModel(t *testing.T) {
	var data = `{"kv": {"a": "b", "c": 1, "d": {"e": "f"}}}`
	var mh codec.MsgpackHandle
	var jh codec.JsonHandle

	// first step, decode data as JSON format
	var s Sample
	var dec = codec.NewDecoderString(data, &jh)
	assert.Nil(t, dec.Decode(&s))

	// second step, encode data as Msgpack format
	var data2 []byte
	var enc = codec.NewEncoderBytes(&data2, &mh)
	assert.Nil(t, enc.Encode(&s))

	// 3rd step, decode data as Msgpack format
	var s2 Sample
	dec = codec.NewDecoderBytes(data2, &mh)
	assert.Nil(t, dec.Decode(&s2))

	// 4th step, encode data as JSON format
	var data3 []byte
	enc = codec.NewEncoderBytes(&data3, &jh)
	assert.Nil(t, enc.Encode(&s2))

	// full flow is [json decode]->[msgpack encode]->[msgpack decode]->[json encode]
	fmt.Println(string(data3))
}

The full data flow is: [json decode] -> [msgpack encode] -> [msgpack decode] -> [json encode].

But the problem is that my input JSON data is {"kv": {"a": "b", "c": 1, "d": {"e": "f"}}}, and the resulting JSON data is {"kv":{"d":{"e":"Zg=="},"a":"Yg==","c":1}}, which is not equal to the original content.

I'm not sure if this is a bug in codec. Is there any way to handle my situation, so that the output data is the same as the input data? Thanks~
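One plausible reading of the output: "Yg==" and "Zg==" are the base64 encodings of "b" and "f", which is how JSON encoders commonly write a []byte. That would be consistent with the string values coming back from Msgpack as []byte when decoded into interface{}, as described by the RawToString comment quoted earlier on this page. The sketch below reproduces only the symptom, using the standard encoding/json rather than codec; the encodeKV helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeKV builds the sample payload and marshals it with encoding/json.
// When asBytes is true the string values are held as []byte, which is
// the assumed state after an untyped Msgpack decode.
func encodeKV(asBytes bool) string {
	var a, f interface{} = "b", "f"
	if asBytes {
		a, f = []byte("b"), []byte("f")
	}
	kv := map[string]interface{}{
		"a": a,
		"c": 1,
		"d": map[string]interface{}{"e": f},
	}
	out, err := json.Marshal(map[string]interface{}{"kv": kv})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(encodeKV(false)) // {"kv":{"a":"b","c":1,"d":{"e":"f"}}}
	fmt.Println(encodeKV(true))  // {"kv":{"a":"Yg==","c":1,"d":{"e":"Zg=="}}}
}
```

If that is the cause, the RawToString option on the handle looks like the relevant setting to try; this is an assumption and has not been verified against codec here.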

Did you read the documentation?

Yes

What are you trying to do?

I'm reading a file that contains CBOR-encoded data and passing the bytes to the decoder.
More specifically, I'm

Decoding without knowing what is in the stream

as said in the primer

Code

// Deserialize the data
//
// Converts CBOR byte array from a file to a Data container
//  d []byte - `The data read from a file`
//
// Returns
//  data interface{} - `The deserialized data`
func Decode(d []byte) (data any) {
	var _d any
	if d == nil {
		return nil
	}

	fmt.Println(d)

	err := codec.NewDecoderBytes(d, new(codec.CborHandle)).Decode(&_d)

	if CheckErr(err) {
		panic(err)
	}

	println(_d)

	return _d
}

Input passed (d []byte)

[163 102 100 97 116 97 95 51 161 102 100 101 112 116 104 50 161 102 100 101 112 116 104 51 111 86 97 108 117 101 32 111 102 32 100 97 116 97 32 51 102 100 97 116 97 95 49 161 102 100 101 112 116 104 50 161 102 100 101 112 116 104 51 111 86 97 108 117 101 32 111 102 32 100 97 116 97 32 49 102 100 97 116 97 95 50 161 102 100 101 112 116 104 50 161 102 100 101 112 116 104 51 111 86 97 108 117 101 32 111 102 32 100 97 116 97 32 50]

Output

$ go run .
> [163 102 100 97 116 97 95 51 161 102 100 101 112 116 104 50 161 102 100 101 112 116 104 51 111 86 97 108 117 101 32 111 102 32 100 97 116 97 32 51 102 100 97 116 97 95 49 161 102 100 101 112 116 104 50 161 102 100 101 112 116 104 51 111 86 97 108 117 101 32 111 102 32 100 97 116 97 32 49 102 100 97 116 97 95 50 161 102 100 101 112 116 104 50 161 102 100 101 112 116 104 51 111 86 97 108 117 101 32 111 102 32 100 97 116 97 32 50]
(0x8fbf60,0xc000089110)

Expected Output CBOR Playground
image

Please Help 🙏 @ugorji

EDIT: I tried to decode a sample json data using JsonHandle and it works perfectly.
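One thing worth checking in the code above: the output line (0x8fbf60,0xc000089110) has the shape that Go's builtin println produces for an interface value, namely its two internal pointer words rather than its contents. The sketch below contrasts the builtin with fmt. The map literal is only an assumed stand-in for whatever the decoder returned, and describe is a hypothetical helper.

```go
package main

import "fmt"

// describe formats any value with fmt's default verb.
func describe(v interface{}) string {
	return fmt.Sprintf("%v", v)
}

func main() {
	// Assumed stand-in for a decoded CBOR map; the real decoded type
	// may differ.
	var d interface{} = map[interface{}]interface{}{
		"depth3": "Value of data 3",
	}

	// The builtin println writes the interface's internal words to
	// stderr, in the form (0x...,0x...).
	println(d)

	// fmt prints the value itself.
	fmt.Println(describe(d)) // map[depth3:Value of data 3]
}
```

If the decoded value prints as expected through fmt.Println or fmt.Printf("%v"), the decoding itself may already be working.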

As part of a Coverity scan, the following is observed in
Details: The expression's value does not depend on the operands; often, this represents an inadvertent logic error.
In github.​com/ugorji/go/codec.​mpdesc(byte, string): An operation with non-constant operands that computes a result with constant value.
image

As part of coverity scan in file msgpack.go:
Details:

  1. The expression's value does not depend on the operands; often, this represents an inadvertent logic error.
    In github.​com/ugorji/go/codec.​msgpackDecDriver.​DecodeInt64(int64): An operation with non-constant operands that computes a result with constant value
  2. The expression's value does not depend on the operands; often, this represents an inadvertent logic error.
    In github.​com/ugorji/go/codec.​msgpackDecDriver.​DecodeNaked(): An operation with non-constant operands that computes a result with constant value.
  3. The expression's value does not depend on the operands; often, this represents an inadvertent logic error.
    In github.​com/ugorji/go/codec.​msgpackDecDriver.​DecodeUint64(uint64): An operation with non-constant operands that computes a result with constant value.
  4. The expression's value does not depend on the operands; often, this represents an inadvertent logic error.
    In github.​com/ugorji/go/codec.​mpdesc(byte, string): An operation with non-constant operands that computes a result with constant value

Screenshots:
1.
image
2.
image
3.
image
4.
image

File: https://github.com/ugorji/go/blob/master/codec/json.go
Details: The expression's value does not depend on the operands; often, this represents an inadvertent logic error.
In github.​com/ugorji/go/codec.​jsonParseInteger([]byte): An operation with non-constant operands that computes a result with constant value

screenshot:
image

File: https://github.com/ugorji/go/blob/master/codec/encode.go
Details:
A null pointer exception will occur.

In github.​com/ugorji/go/codec.​Encoder.​encode(interface{}): Dereference of an explicit null (nil) value (CWE-476)
Screenshot:
image
image
image

image

File : https://github.com/ugorji/go/blob/master/codec/encode.go
Error -
The indicated dead code may have performed some action; that action will never occur.

In github.​com/ugorji/go/codec.​Encoder.​encode(interface{}): Code can never be reached because of a logical contradiction (CWE-561)

Screenshot:
image

File : https://github.com/ugorji/go/blob/master/codec/decode.go
Error: The indicated dead code may have performed some action; that action will never occur.
In github.​com/ugorji/go/codec.​Decoder.​kMap(*github.​com/ugorji/go/codec.​codecFnInfo, reflect.​Value): Code can never be reached because of a logical contradiction (CWE-561)
Screenshot:
image

We ran into an issue today with codec 1.2.6 (also checked against 1.2.7) where invalid MsgPack data, containing a string with non-UTF-8 bytes, passes decoding.

Here is the relevant bit showing the invalid string that can be passed:

invalid := []byte{ 0xac /* \xed\xbf\xbf is invalid */, 0xed, 0xbf, 0xbf, 't', '-', 'a', 'd', 'd', 'r', 'e', 's', 's' }

When decoded, everything succeeds, but when utf8.ValidString() is called on the resulting string, it reports that the string is not valid UTF-8.

I'm not sure how to write up a test/example to be more helpful but wanted to bring up the potential issue.

Hi,

I tried to update from version v1.1.1 to v1.2.7, but I found a bug.
Pointer-embedded structs are handled incorrectly: when the embedded struct pointer is nil, after an encode/decode round trip it points to a zero-valued struct instead of staying nil.

To reproduce, use this code:

go.mod:

module ugorji-test

go 1.17

require (
	github.com/davecgh/go-spew v1.1.1
	github.com/ugorji/go v1.1.1
//github.com/ugorji/go/codec v1.2.7
)

main.go

package main

import (
	"bytes"
	"fmt"
	"reflect"

	"github.com/davecgh/go-spew/spew"
	"github.com/ugorji/go/codec"
)

type Test struct {
	*Embedded
}

type Embedded struct {
	Field string
}

func main() {
	handle := &codec.MsgpackHandle{}

	orig := &Test{}
	buf := new(bytes.Buffer)

	enc := codec.NewEncoder(buf, handle)
	err := enc.Encode(orig)
	if err != nil {
		panic(err)
	}

	decoded := &Test{}

	dec := codec.NewDecoder(buf, handle)
	err = dec.Decode(decoded)
	if err != nil {
		panic(err)
	}

	if !reflect.DeepEqual(orig, decoded) {
		fmt.Printf("orig: \n%v\n", spew.Sdump(orig))
		fmt.Printf("decoded: \n%v\n", spew.Sdump(decoded))
	} else {
		fmt.Println("orig and decoded are the same")
	}
}

When using v1.1.1, the result is the "orig and decoded are the same" message.
With v1.2.7:

orig: 
(*main.Test)(0xc000010030)({
 Embedded: (*main.Embedded)(<nil>)
})

decoded:
(*main.Test)(0xc000010038)({
 Embedded: (*main.Embedded)(0xc000012230)({
  Field: (string) ""
 })
})

I understand Go very little, but it seems to me that bench_test.go should import helper_test.go. I used the go2rpm script to prepare a Fedora Go package, but it fails on all releases at the check stage.

+ cd go-1.2.6
+ LDFLAGS=' -X github.com/ugorji/go/version=1.2.6'
+ GO_TEST_FLAGS='-buildmode pie -compiler gc'
+ GO_TEST_EXT_LD_FLAGS='-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld  '
+ go-rpm-integration check -i github.com/ugorji/go -b /builddir/build/BUILD/go-1.2.6/_build/bin -s /builddir/build/BUILD/go-1.2.6/_build -V 1.2.6-1.fc35 -p /builddir/build/BUILDROOT/golang-github-ugorji-1.2.6-1.fc35.x86_64 -g /usr/share/gocode -r '.*example.*'
Testing    in: /builddir/build/BUILD/go-1.2.6/_build/src
         PATH: /builddir/build/BUILD/go-1.2.6/_build/bin:/builddir/.local/bin:/builddir/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/sbin
       GOPATH: /builddir/build/BUILD/go-1.2.6/_build:/usr/share/gocode
  GO111MODULE: off
      command: go test -buildmode pie -compiler gc -ldflags " -X github.com/ugorji/go/version=1.2.6 -extldflags '-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld  '"
      testing: github.com/ugorji/go
github.com/ugorji/go/codec
PASS
ok  	github.com/ugorji/go/codec	0.451s
github.com/ugorji/go/codec/bench
# github.com/ugorji/go/codec/bench [github.com/ugorji/go/codec/bench.test]
./bench_test.go:76:15: undefined: approxDataSize
./bench_test.go:198:9: undefined: deepEqual
FAIL	github.com/ugorji/go/codec/bench [build failed]

More on rawhide log or my copr build

When registering an extension to encode time.Time, as advertised, the extension won't get picked up, regardless of the TimeNotBuiltin flag.

	writer := bytes.Buffer{}
	var json codec.JsonHandle
	json.TimeNotBuiltin = true //won't run for both true and false
	err := json.SetInterfaceExt(timeType, 1, NewTimeMarshaller()) //implements InterfaceExt

Contrary to what is advertised, there is a hidden check that silently ignores extensions registered for time.Time. See the code below (line 947 in helper.go). I think a check for the json.TimeNotBuiltin flag is missing.

func (x *basicHandleRuntimeState) setExt(rt reflect.Type, tag uint64, ext Ext) (err error) {
	rk := rt.Kind()
	for rk == reflect.Ptr {
		rt = rt.Elem()
		rk = rt.Kind()
	}

	if rt.PkgPath() == "" || rk == reflect.Interface { // || rk == reflect.Ptr {
		return fmt.Errorf("codec.Handle.SetExt: Takes named type, not a pointer or interface: %v", rt)
	}

	rtid := rt2id(rt)
	switch rtid {
	case timeTypId, rawTypId, rawExtTypId:
		// these are all natively supported type, so they cannot have an extension.
		// However, we do not return an error for these, as we do not document that.
		// Instead, we silently treat as a no-op, and return.
		return
	}

This should be changed to:

func (x *basicHandleRuntimeState) setExt(rt reflect.Type, tag uint64, ext Ext) (err error) {
	...
	switch rtid {
	case rawTypId, rawExtTypId:
		// these are all natively supported type, so they cannot have an extension.
		// However, we do not return an error for these, as we do not document that.
		// Instead, we silently treat as a no-op, and return.
		return
	case timeTypId:
		if x.timeBuiltin {
			// ignore time as it is supported internally
			return
		}
	}

Please see my example below. The issue occurs when you have a certain nesting of structs with maps. In this case, we have a struct with a map of string to slice of pointer to struct with a map of string to slice of string. This appears to cause the kMap decode function, which is called recursively, to assign memory where it should not.

package main

import (
	"bytes"
	"fmt"

	"github.com/ugorji/go/codec"
)

type Inner struct {
	InnerMap map[string][]string
}

type Outer struct {
	OuterMap map[string][]*Inner
}

func reproPanic() {
	inner := &Inner{}
	inner.InnerMap = map[string][]string{
		"foo": {
			"bar",
		},
	}

	outer := &Outer{}
	outer.OuterMap = make(map[string][]*Inner)
	outer.OuterMap["blah"] = []*Inner{
		inner,
	}

	outer2 := &Outer{}

	encodeDecodeEncodeRoundtrip(outer, outer2)

	// not expected to get here
	fmt.Printf("NOT REACHED!?")
}

func encodeDecodeEncodeRoundtrip(obj, dest interface{}) {

	handle := &codec.MsgpackHandle{}
	encoder := codec.NewEncoder(nil, handle)
	decoder := codec.NewDecoder(nil, handle)

	buf := bytes.NewBuffer(nil)
	encoder.Reset(buf)
	encoder.MustEncode(obj)

	decoder.Reset(bytes.NewBuffer(buf.Bytes()))
	decoder.MustDecode(dest)

	buf = bytes.NewBuffer(nil)
	encoder.Reset(buf)
	encoder.MustEncode(dest)
}

func main() {
	reproPanic()
}