Octet Stream Encoding

When a Controller operation returns `byte[]` the content should automatically be set to `application/octet-stream` #7926.

Comments

commented Nov 9, 2017

Hi, thank you for the amazing repo!

Instead of Content-Type: text/csv is it possible to set Content-Type as application/octet-stream for the attached csv files?

thank you
Radoslaw

commented Nov 9, 2017

Hey, thanks for posting. I'm not sure why you would want that :)? It would help understanding how it would fit in the code.

commented Nov 10, 2017

Thank you Pascal for your prompt reply. Based on my understanding, all csv files sent via yagmail are detected as 'Content-Type: text/csv; name="some_filename.csv"', which is a very nice feature because the content of the csv file is displayed directly inline in the body of the email.

But in my case I would like the attachments to be displayed as attachments. See the screenshot.

Thank you.
Have a wonderful weekend!
Radoslaw

commented Nov 11, 2017

Hi Radoslaw, just a guess... but are you mentioning the csv in contents or in attachments? In case you cannot get it to work with using attachments, it's most likely a bug. You too, enjoy the weekend :-)

commented Nov 11, 2017

Hi Pascal, yes I'm using attachments. See the code and zipped csv.

smtp.send(to=myemail, subject='test1', attachments='/path/123_ARIADNA-10-2017.csv' )
123_ARIADNA-10-2017.csv.zip

Thank you
Radoslaw

added a commit that referenced this issue Nov 11, 2017

commented Nov 11, 2017

Could you try to see if it works for you with yagmail-0.10.209?

commented Nov 11, 2017

No luck.

print(yagmail.__version__)
shows: 0.10.209
but I'm still getting the same results.

Here is the raw content of the email:

commented Nov 11, 2017

I think I misunderstood the question!

For me in gmail, the csv shows just fine (not inline, but as attachment). But you want application/octet-stream as type? Why? Isn't the type of a .csv file 'text/csv'?

commented Nov 12, 2017

Yes, it works in the Gmail web client.
The issue has something to do with the Mac Mail app (desktop and iOS).
If you have macOS or an iPhone, then you will see that the csv attachments are displayed inline.

See the part of the raw source of an email sent via Perl's MIME::Lite + Net::SMTP, which works:

Based on my research, adding Content-Disposition: attachment with Content-Type: application/csv should work.
In the meantime, a small amendment in sender.py did the trick ;)
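
For readers who want to see what such headers look like, here is a minimal sketch that uses only Python's standard `email` package. It is not the sender.py amendment mentioned above (the thread does not show that change), and the addresses, file name and function name are placeholders chosen for illustration.

```python
from email.message import EmailMessage


def build_message_with_csv(csv_bytes: bytes, filename: str) -> EmailMessage:
    """Build a message whose CSV is attached as application/octet-stream."""
    msg = EmailMessage()
    msg["Subject"] = "test1"
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg.set_content("Report attached.")
    # add_attachment() sets "Content-Disposition: attachment" together with
    # the requested Content-Type on the new MIME part.
    msg.add_attachment(
        csv_bytes,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg


msg = build_message_with_csv(b"a,b\n1,2\n", "report.csv")
attachment = next(msg.iter_attachments())
print(attachment["Content-Type"])
print(attachment["Content-Disposition"])
```

Whether a given mail client then shows the file as an attachment rather than inline still depends on that client.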

changed the title to "Content-Type: application/octet-stream for the csv files" Nov 12, 2017

commented May 4, 2018

Had the same problem, still a bug at v0.10.212. @radoslawoska fix works.
CSV attachments sent with yagmail can't be read properly on the Gmail client running on Apple systems in the current version. It's really important ^^.
cheers!

commented May 4, 2018

I have missed this one, thanks for bringing it back to attention. Feel free to make a PR as you've already found the fix, I'd gladly accept it and put it on pypi!

added a commit to amrutadotorg/yagmail that referenced this issue Nov 5, 2018

csv files proper handling in the Mac mail client

referenced this issue Nov 5, 2018

Open

csv files proper handling in the Mac mail client #129

commented Nov 5, 2018

thank you, PR created


X.690 is an ITU-T standard specifying several ASN.1 encoding formats:

  • Basic Encoding Rules (BER)
  • Canonical Encoding Rules (CER)
  • Distinguished Encoding Rules (DER)

The Basic Encoding Rules were the original rules laid out by the ASN.1 standard for encoding abstract information into a concrete data stream. The rules, collectively referred to as a transfer syntax in ASN.1 parlance, specify the exact octet sequences which are used to encode a given data item. The syntax defines such elements as the representations for basic data types, the structure of length information, and the means for defining complex or compound types based on more primitive types. The BER syntax, along with two subsets of BER (the Canonical Encoding Rules and the Distinguished Encoding Rules), is defined by the ITU-T's X.690 standards document, which is part of the ASN.1 document series.

BER encoding

The format for Basic Encoding Rules specifies a self-describing and self-delimiting format for encoding ASN.1 data structures. Each data element is encoded as a type identifier, a length description, the actual data elements, and, where necessary, an end-of-content marker. These types of encodings are commonly called type-length-value or TLV encodings. This format allows a receiver to decode the ASN.1 information from an incomplete stream, without requiring any pre-knowledge of the size, content, or semantic meaning of the data.[1]

Encoding structure

The encoding of data generally consists of four components, which appear in the following order:

  Component                 Role
  Identifier octets         Type
  Length octets             Length
  Contents octets           Value
  End-of-contents octets

The end-of-contents octets are optional and are used only with the indefinite length form. The contents octets may also be omitted if there is no content to encode, as with the NULL type.
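
The type-length-value layout can be illustrated with a small reader. This is a sketch only: the function name `read_tlv` is hypothetical, and it handles just single-octet identifiers and definite lengths, not the long tag form or the indefinite length form.

```python
from typing import Tuple


def read_tlv(data: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    """Return (identifier_octet, contents, next_offset) for one element."""
    if offset + 2 > len(data):
        raise ValueError("truncated element")
    identifier = data[offset]
    if identifier & 0x1F == 0x1F:
        raise NotImplementedError("long-form tag numbers are not handled here")
    first_length = data[offset + 1]
    pos = offset + 2
    if first_length < 0x80:
        length = first_length                 # short definite form
    elif first_length == 0x80:
        raise NotImplementedError("indefinite length is not handled here")
    else:
        count = first_length & 0x7F           # long definite form
        if count == 127:
            raise ValueError("reserved length octet")
        if pos + count > len(data):
            raise ValueError("truncated length octets")
        length = int.from_bytes(data[pos:pos + count], "big")
        pos += count
    end = pos + length
    if end > len(data):
        raise ValueError("truncated contents octets")
    return identifier, data[pos:end], end


# INTEGER 5, then NULL (no contents octets), then OCTET STRING "abc".
stream = bytes.fromhex("020105" "0500" "0403616263")
elements = []
offset = 0
while offset < len(stream):
    identifier, contents, offset = read_tlv(stream, offset)
    elements.append((identifier, contents))
print(elements)
```

Because every element carries its own length, the reader can step from one element to the next without knowing what the values mean.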

Identifier octets

Types

Data (especially members of sequences and sets and choices) can be tagged with a unique tag number (shown in ASN.1 within square brackets []) to distinguish that data from other members. Such tags can be implicit (where they are encoded as the TLV tag of the value instead of using the base type as the TLV tag) or explicit (where the tag is used in a constructed TLV that wraps the base type TLV). The default tagging style is explicit, unless implicit is set at ASN.1 module-level. Such tags have a default class of context-specific, but that can be overridden by using a class name in front of the tag.

The encoding of a choice value is the same as the encoding of a value of the chosen type. The encoding may be primitive or constructed, depending on the chosen type. The tag used in the identifier octets is the tag of the chosen type, as specified in the ASN.1 definition of the chosen type.

The following tags are native to ASN.1:

Types, universal class

  Name                      Value encodings  Tag number  Tag number
                                             (decimal)   (hexadecimal)
  End-of-Content (EOC)      Primitive        0           0
  BOOLEAN                   Primitive        1           1
  INTEGER                   Primitive        2           2
  BIT STRING                Both             3           3
  OCTET STRING              Both             4           4
  NULL                      Primitive        5           5
  OBJECT IDENTIFIER         Primitive        6           6
  Object Descriptor         Both             7           7
  EXTERNAL                  Constructed      8           8
  REAL (float)              Primitive        9           9
  ENUMERATED                Primitive        10          A
  EMBEDDED PDV              Constructed      11          B
  UTF8String                Both             12          C
  RELATIVE-OID              Primitive        13          D
  TIME                      Primitive        14          E
  Reserved                                   15          F
  SEQUENCE and SEQUENCE OF  Constructed      16          10
  SET and SET OF            Constructed      17          11
  NumericString             Both             18          12
  PrintableString           Both             19          13
  T61String                 Both             20          14
  VideotexString            Both             21          15
  IA5String                 Both             22          16
  UTCTime                   Both             23          17
  GeneralizedTime           Both             24          18
  GraphicString             Both             25          19
  VisibleString             Both             26          1A
  GeneralString             Both             27          1B
  UniversalString           Both             28          1C
  CHARACTER STRING          Both             29          1D
  BMPString                 Both             30          1E
  DATE                      Primitive        31          1F
  TIME-OF-DAY               Primitive        32          20
  DATE-TIME                 Primitive        33          21
  DURATION                  Primitive        34          22
  OID-IRI                   Primitive        35          23
  RELATIVE-OID-IRI          Primitive        36          24

The list of Universal Class tag assignments can be found in Rec. ITU-T X.680, clause 8, table 1.[2]

Encoding

The identifier octets encode the element type as an ASN.1 tag, consisting of the class and number, and whether the contents octets represent a constructed or primitive value. Note that some types can have values with either primitive or constructed encodings. The identifier is encoded as one or more octets.

  Octet 1                                 Octet 2 onwards
  Bits 8–7    Bit 6  Bits 5–1             Bit 8  Bits 7–1
  Tag class   P/C    Tag number (0–30)    N/A
  Tag class   P/C    31                   More   Tag number

In the initial octet, bit 6 encodes whether the type is primitive or constructed, bits 7–8 encode the class of the type, and bits 1–5 encode the tag number. The following values are possible:

  Class             Value  Description
  Universal         0      The type is native to ASN.1
  Application       1      The type is only valid for one specific application
  Context-specific  2      Meaning of this type depends on the context (such as within a sequence, set or choice)
  Private           3      Defined in private specifications

  P/C              Value  Description
  Primitive (P)    0      The contents octets directly encode the element value.
  Constructed (C)  1      The contents octets contain 0, 1, or more element encodings.
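
The bit layout of the initial identifier octet can be unpacked with a few shifts and masks. The following is a sketch; the names `Identifier` and `decode_first_identifier_octet` are invented for this example.

```python
from typing import NamedTuple

CLASS_NAMES = ("Universal", "Application", "Context-specific", "Private")


class Identifier(NamedTuple):
    tag_class: str
    constructed: bool
    tag_number: int  # a value of 31 signals that the long form follows


def decode_first_identifier_octet(octet: int) -> Identifier:
    """Split the initial identifier octet into class, P/C bit and tag number."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0..255")
    tag_class = CLASS_NAMES[octet >> 6]   # bits 8-7
    constructed = bool(octet & 0x20)      # bit 6
    tag_number = octet & 0x1F             # bits 5-1
    return Identifier(tag_class, constructed, tag_number)


print(decode_first_identifier_octet(0x30))  # SEQUENCE: universal, constructed, 16
print(decode_first_identifier_octet(0x02))  # INTEGER: universal, primitive, 2
print(decode_first_identifier_octet(0xA0))  # context-specific, constructed, 0
```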

Long form

If the tag number is too large for the 5-bit tag field, it has to be encoded in further octets.

The initial octet encodes the class and primitive/constructed as before, and bits 1–5 are all 1. The tag number is encoded in the following octets, where bit 8 of each is 1 if there are more octets, and bits 1–7 encode the tag number. The tag number bits combined, big-endian, encode the tag number. The least number of following octets should be encoded; that is, bits 1–7 should not all be 0 in the first following octet.
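
A sketch of the long form, covering only the octets that follow the initial identifier octet. Both function names are hypothetical.

```python
from typing import Tuple


def encode_tag_number_long_form(tag_number: int) -> bytes:
    """Encode a tag number >= 31 as base-128 groups, most significant first."""
    if tag_number < 31:
        raise ValueError("tag numbers below 31 fit in the initial octet")
    groups = []
    while tag_number:
        groups.append(tag_number & 0x7F)
        tag_number >>= 7
    groups.reverse()
    # Bit 8 is set on every octet except the last one.
    return bytes(g | 0x80 for g in groups[:-1]) + bytes([groups[-1]])


def decode_tag_number_long_form(data: bytes) -> Tuple[int, int]:
    """Return (tag_number, octets_consumed) for the subsequent octets."""
    if not data:
        raise ValueError("no octets to decode")
    if data[0] == 0x80:
        raise ValueError("bits 1-7 of the first following octet are all 0")
    value = 0
    for index, octet in enumerate(data):
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:          # bit 8 clear: this was the last octet
            return value, index + 1
    raise ValueError("truncated tag number")


print(encode_tag_number_long_form(201).hex())
print(decode_tag_number_long_form(bytes.fromhex("8149")))
```

For example, a context-specific primitive tag numbered 201 would be the initial octet 0x9F followed by 0x81 0x49.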

Length octets

There are two forms of the length octets: The definite form and the indefinite form.

First length octet

  Form             Bit 8  Bits 7–1
  Definite, short  0      Length (0–127)
  Indefinite       1      0
  Definite, long   1      Number of following octets (1–126)
  Reserved         1      127

Definite form

This encodes the number of contents octets. It is always used if the encoding is primitive, and may be used if the encoding is constructed and the data are immediately available. There is a short form and a long form, which can encode different ranges of lengths. Numeric data is encoded as unsigned integers, with the least significant bit in the rightmost position (bit 1).

The short form consists of a single octet in which bit 8 is 0, and bits 1–7 encode the length (which may be 0) as a number of octets.

The long form consists of 1 initial octet followed by 1 or more subsequent octets, containing the length. In the initial octet, bit 8 is 1, and bits 1–7 (excluding the values 0 and 127) encode the number of octets that follow.[1] The following octets encode, as big-endian, the length (which may be 0) as a number of octets.

Long form example, length 435

  Octet 1     10000010            Bit 8 = 1: long form; bits 7–1 = 2: 2 length octets follow
  Octets 2–3  00000001 10110011   435 contents octets
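
The short and long definite forms can be sketched as a pair of helpers. The names `encode_length` and `decode_length` are hypothetical; the encoder always picks the shortest encoding, which BER permits but does not require.

```python
from typing import Optional, Tuple


def encode_length(length: int) -> bytes:
    """Encode a definite length using the shortest possible form."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 128:
        return bytes([length])                       # short form
    body = length.to_bytes((length.bit_length() + 7) // 8, "big")
    if len(body) > 126:
        raise ValueError("length too large for the long form")
    return bytes([0x80 | len(body)]) + body          # long form


def decode_length(data: bytes) -> Tuple[Optional[int], int]:
    """Return (length, octets_consumed); length is None for the indefinite form."""
    if not data:
        raise ValueError("no length octets")
    first = data[0]
    if first < 0x80:
        return first, 1
    count = first & 0x7F
    if count == 0:
        return None, 1
    if count == 127:
        raise ValueError("reserved first length octet")
    if len(data) < 1 + count:
        raise ValueError("truncated length octets")
    return int.from_bytes(data[1:1 + count], "big"), 1 + count


print(encode_length(435).hex())                 # the length-435 example
print(decode_length(bytes.fromhex("8201b3")))
```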

Indefinite form

This does not encode the length at all; instead, it indicates that the contents octets finish at marker octets. It applies to constructed types and is typically used if the content is not immediately available at encoding time.

It consists of a single octet, in which bit 8 is 1 and bits 1–7 are 0. Then, 2 end-of-contents octets must terminate the contents octets.
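
As a sketch of the indefinite form, the helper below wraps already-encoded child elements in a constructed element. The name `encode_indefinite` is hypothetical, and the end-of-contents octets are taken to be two zero octets (tag 0, length 0), matching the EOC row of the universal class table.

```python
from typing import Iterable

END_OF_CONTENTS = b"\x00\x00"  # EOC: universal tag 0 with length 0


def encode_indefinite(identifier_octet: int, children: Iterable[bytes]) -> bytes:
    """Wrap encoded child elements using the indefinite length form."""
    if not identifier_octet & 0x20:
        raise ValueError("the indefinite form applies to constructed encodings")
    # 0x80 is the single length octet: bit 8 set, bits 7-1 all zero.
    return bytes([identifier_octet, 0x80]) + b"".join(children) + END_OF_CONTENTS


integer_five = bytes.fromhex("020105")
null_value = bytes.fromhex("0500")
sequence = encode_indefinite(0x30, [integer_five, null_value])
print(sequence.hex())
```

The children can be written out one at a time as they become available, since the total length is never needed.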

Contents octets

The contents octets encode the element data value.[1]

Note that there may be no contents octets (hence, the element has a length of 0) if only the existence of the ASN.1 object, or its emptiness, is to be noted. For example, this is the case for an ASN.1 NULL value.

CER encoding

CER (Canonical Encoding Rules) is a restricted variant of BER for producing unequivocal transfer syntax for data structures described by ASN.1. Whereas BER gives choices as to how data values may be encoded, CER (together with DER) selects just one encoding from those allowed by the basic encoding rules, eliminating the rest of the options. CER is useful when the encodings must be preserved, e.g., in security exchanges.

DER encoding

DER (Distinguished Encoding Rules) is a restricted variant of BER for producing unequivocal transfer syntax for data structures described by ASN.1. Like CER, DER encodings are valid BER encodings. DER is the same thing as BER with all but one sender's options removed.

DER is a subset of BER providing for exactly one way to encode an ASN.1 value. DER is intended for situations when a unique encoding is needed, such as in cryptography, and ensures that a data structure that needs to be digitally signed produces a unique serialized representation. DER can be considered a canonical form of BER. For example, in BER a Boolean value of true can be encoded as any of 255 non-zero byte values, while in DER there is one way to encode a boolean value of true.
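
The Boolean example can be sketched as a decoder with an optional strict mode. The function name is hypothetical, and the specific DER value for true (the single octet 0xFF) is taken from X.690 rather than from the text above.

```python
def decode_boolean(contents: bytes, der: bool = False) -> bool:
    """Decode the contents octets of a BOOLEAN, optionally enforcing DER."""
    if len(contents) != 1:
        raise ValueError("a BOOLEAN has exactly one contents octet")
    octet = contents[0]
    # Assumption from X.690: DER allows only 0x00 (false) and 0xFF (true).
    if der and octet not in (0x00, 0xFF):
        raise ValueError("not a DER BOOLEAN encoding")
    return octet != 0


print(decode_boolean(b"\x01"))             # accepted under BER
print(decode_boolean(b"\xff", der=True))   # the only DER encoding of true
```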

The most significant DER encoding constraints are:

  1. Length encoding must use the definite form
    • Additionally, the shortest possible length encoding must be used
  2. Bitstring, octetstring, and restricted character strings must use the primitive encoding
  3. Elements of a Set are encoded in sorted order, based on their tag value
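
The first constraint can be checked mechanically. The sketch below tests whether a run of length octets is definite and as short as possible; the name `is_der_length` is hypothetical, and the function covers only this one constraint, not DER validation as a whole.

```python
def is_der_length(length_octets: bytes) -> bool:
    """Return True if the length octets are definite and minimally encoded."""
    if not length_octets:
        raise ValueError("no length octets")
    first = length_octets[0]
    if first < 0x80:
        return len(length_octets) == 1       # short form, always minimal
    count = first & 0x7F
    if count == 0 or count == 127:
        return False                         # indefinite or reserved
    if len(length_octets) != 1 + count:
        return False                         # wrong number of following octets
    body = length_octets[1:]
    if body[0] == 0:
        return False                         # leading zero octet wastes space
    # Lengths below 128 must use the short form instead.
    return int.from_bytes(body, "big") >= 128


print(is_der_length(bytes.fromhex("8201b3")))  # minimal long form
print(is_der_length(bytes.fromhex("8105")))    # 5 fits in the short form
```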

DER is widely used for digital certificates such as X.509.

BER, CER and DER compared

The key difference between the BER format and the CER or DER formats is the flexibility provided by the Basic Encoding Rules. BER, as explained above, is the basic set of encoding rules given by ITU-T X.690 for the transfer of ASN.1 data structures. It gives senders clear rules for encoding data structures they want to send, but also leaves senders some encoding choices. As stated in the X.690 standard, 'Alternative encodings are permitted by the basic encoding rules as a sender's option. Receivers who claim conformance to the basic encoding rules shall support all alternatives'.[1]

A receiver must be prepared to accept all legal encodings in order to legitimately claim BER-compliance. By contrast, both CER and DER restrict the available length specifications to a single option. As such, CER and DER are restricted forms of BER and serve to disambiguate the BER standard.

CER and DER differ in the set of restrictions that they place on the sender. The basic difference between CER and DER is that DER uses the definite length form and CER uses the indefinite length form in some precisely defined cases. That is, DER always has leading length information, while CER uses end-of-contents octets instead of providing the length of the encoded data. Because of this, CER requires less metadata for large encoded values, while DER requires less for small ones.
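
The trade-off in framing overhead can be made concrete with a little arithmetic. This is a rough sketch with hypothetical names: it counts only the length octets and end-of-contents octets of a single constructed element, and ignores any other CER rules that affect the encoded size.

```python
# One length octet (0x80) plus two end-of-contents octets.
INDEFINITE_OVERHEAD = 3


def definite_length_overhead(content_length: int) -> int:
    """Number of length octets needed by the shortest definite form."""
    if content_length < 0:
        raise ValueError("length must be non-negative")
    if content_length < 128:
        return 1
    return 1 + (content_length.bit_length() + 7) // 8


for size in (100, 435, 65535, 65536):
    print(size, definite_length_overhead(size), INDEFINITE_OVERHEAD)
```

Under these assumptions the definite form costs fewer octets up to a content length of 127, the same number from 256 to 65535, and more from 65536 upwards.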

In order to facilitate a choice between encoding rules, the X.690 standards document provides the following guidance:

The distinguished encoding rules is more suitable than the canonical encoding rules if the encoded value is small enough to fit into the available memory and there is a need to rapidly skip over some nested values. The canonical encoding rules is more suitable than the distinguished encoding rules if there is a need to encode values that are so large that they cannot readily fit into the available memory or it is necessary to encode and transmit a part of a value before the entire value is available. The basic encoding rules is more suitable than the canonical or distinguished encoding rules if the encoding contains a set value or set-of value and there is no need for the restrictions that the canonical and distinguished encoding rules impose.

Criticisms of BER encoding

There is a common perception of BER as being 'inefficient' compared to alternative encoding rules. It has been argued by some that this perception is primarily due to poor implementations, not necessarily any inherent flaw in the encoding rules.[3] These implementations rely on the flexibility that BER provides to use encoding logic that is easier to implement, but results in a larger encoded data stream than necessary. Whether this inefficiency is reality or perception, it has led to a number of alternative encoding schemes, such as the Packed Encoding Rules, which attempt to improve on BER performance and size.

Other alternative formatting rules, which still provide the flexibility of BER but use alternative encoding schemes, are also being developed. The most popular of these are XML-based alternatives, such as the XML Encoding Rules and ASN.1 SOAP.[4] In addition, there is a standard mapping to convert an XML Schema to an ASN.1 schema, which can then be encoded using BER.[5]

Usage

Despite its perceived problems, BER is a popular format for transmitting data, particularly in systems with different native data encodings.

  • The SNMP and LDAP protocols specify ASN.1 with BER as their required encoding scheme.
  • The EMV standard for credit and debit cards uses BER to encode data onto the card.
  • The digital signature standard PKCS #7 also specifies ASN.1 with BER to encode encrypted messages and their digital signature or digital envelope.
  • Many telecommunication systems, such as ISDN, toll-free call routing, and most cellular phone services use ASN.1 with BER to some degree for transmitting control messages over the network.
  • GSM TAP (Transferred Account Procedures) and NRTRDE (Near Real Time Roaming Data Exchange) files are encoded using BER.[1]

By comparison, the more definite DER encoding is widely used to transfer digital certificates such as X.509.

See also

  • Packed Encoding Rules (PER, X.691)
  • Structured Data eXchange Format (SDXF)

References

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the 'relicensing' terms of the GFDL, version 1.3 or later.

  1. Information technology – ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER), ITU-T X.690, 07/2002
  2. http://itu.int/ITU-T/X.680
  3. Lin, Huai-An. “Estimation of the Optimal Performance of ASN.1/BER Transfer Syntax”. ACM Computer Communication Review. July 93, 45–58.
  4. ITU-T Rec. X.892, ISO/IEC 24824-2
  5. ITU-T X.694, ISO/IEC 8825-5

External links

  • jASN1 Open source Java ASN.1 BER/DER coding library by beanit
  • PHPASN1 PHP ASN.1 BER encoding/decoding library at github, GPL-licensed
  • ASN1js JavaScript ASN.1 BER encoding/decoding library at github, GPL-licensed
Retrieved from 'https://en.wikipedia.org/w/index.php?title=X.690&oldid=894908927'