RFC 2047 encoding is specifically designed for non-ASCII characters in email header fields that contain human-readable text, such as the subject line, or the friendly-name part of the From, To, and Cc headers. It is not intended for encoding the actual email address within these headers. While some email clients or systems might tolerate a fully encoded From header, this practice is not compliant with RFC standards and can lead to delivery issues, particularly with stricter receivers like Gmail, which may reject such messages or silently remove the encoding.
Key findings
RFC 2047 purpose: RFC 2047 permits encoding of non-ASCII characters in specific parts of email headers that are meant for human readability. This includes the Subject field, comments, and the display name (friendly-from) preceding an address in fields like From, To, and Cc.
Email address exclusion: Section 5 of RFC 2047 limits where encoded-words can be used and states that an encoded-word must not appear in any portion of an addr-spec, that is, the local-part and domain of an email address. Encoding the entire email address is therefore invalid, even if some systems process it.
Gmail's strictness: Gmail is known to be particularly strict regarding RFC 5322 compliance. Messages with a fully RFC 2047 encoded From header, including the email address itself, are often rejected with an RFC 5322 non-compliant error. This behavior aligns with proper RFC interpretation.
Friendly-from vs. address: The correct approach is to encode only the friendly-from portion of the From header, leaving the actual email address in plain ASCII. This ensures broad compatibility and compliance.
Security implications: Encoding the entire email address can pose a security risk. It could hide malicious addresses that appear legitimate, for instance through homoglyphs (Unicode characters that look like ASCII characters), making phishing attempts harder to detect.
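As an illustration of the friendly-from approach described above, here is a minimal Python sketch that encodes only the display name and leaves the address untouched. It uses the standard library's email.utils.formataddr; the helper name build_from_header and the sample name and address are assumptions made for this example.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr


def build_from_header(display_name: str, address: str) -> str:
    """Return a From value whose display name is RFC 2047 encoded if needed.

    formataddr() raises UnicodeEncodeError when the address is not ASCII,
    so a non-ASCII address is never silently wrapped in an encoded-word.
    """
    return formataddr((display_name, address), charset="utf-8")


from_value = build_from_header("José Müller", "jose@example.com")
print(from_value)

# Round trip: split the header first, then decode only the display name.
encoded_name, addr_spec = parseaddr(from_value)
decoded_name = str(make_header(decode_header(encoded_name)))
print(decoded_name, addr_spec)
```

The address reaches the wire exactly as it was supplied, while the display name becomes an ASCII-only encoded-word that a receiving client can decode for display.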
Key considerations
RFC compliance: Always adhere to the specific guidelines outlined in RFC 2047 (for encoded words) and RFC 5322 (for message format) to ensure optimal email deliverability and avoid compliance errors. Understanding these nuances is crucial for troubleshooting RFC compliance issues.
Header encoding: When using non-ASCII characters in subject lines or friendly-from names, ensure only these human-readable parts are RFC 2047 encoded. The email address itself (e.g., local-part@domain.com) must remain in plain ASCII or use the internationalized email address format defined by other RFCs (RFC 4952 introduced the framework, which RFC 6530 through RFC 6532 later replaced).
Preventing errors: Incorrect encoding of email addresses can lead to various delivery failures, including Gmail's 'sender's email address uses abnormal characters' error. It's important to understand what causes this error and other related issues.
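One way to catch this mistake before sending is a small pre-flight check on the From value. The sketch below is an assumption-laden illustration, not a full RFC 5322 parser: the function names are hypothetical, and the angle-bracket regex deliberately ignores edge cases such as quoted display names that contain brackets.

```python
import base64
import re

# Shape of an RFC 2047 encoded-word: =?charset?B-or-Q?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]+\?=")
ANGLE_ADDR = re.compile(r"<([^<>]*)>")


def address_part(header_value: str) -> str:
    """Return the text inside <...>, or the whole value if no brackets exist."""
    match = ANGLE_ADDR.search(header_value)
    return (match.group(1) if match else header_value).strip()


def from_header_problems(header_value: str) -> list[str]:
    """Collect reasons why the address portion looks non-compliant."""
    problems = []
    addr = address_part(header_value)
    if ENCODED_WORD.search(addr):
        problems.append("encoded-word found where the address should be")
    if not addr.isascii():
        problems.append("address contains non-ASCII characters")
    if "@" not in addr:
        problems.append("no plain addr-spec (local-part@domain) visible")
    return problems


good = "=?utf-8?q?Jos=C3=A9?= <jose@example.com>"
# The whole header, address included, wrapped in one encoded-word.
payload = base64.b64encode("José <jose@example.com>".encode("utf-8")).decode("ascii")
bad = f"=?utf-8?b?{payload}?="

print(from_header_problems(good))
print(from_header_problems(bad))
```

The first header yields an empty list. The second is flagged twice, because the address is hidden inside the encoded-word and no plain addr-spec is visible.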
What email marketers say
Email marketers often encounter encoding challenges when dealing with non-ASCII characters in email headers. While they aim for broad compatibility and engaging display names, some practices, if not strictly compliant with RFCs, can lead to deliverability problems, especially with major inbox providers like Gmail. There's a common consensus that the friendly display name can be encoded, but the actual email address should remain unencoded ASCII or use specific internationalized email address formats.
Key opinions
Common issue: Many marketers report facing deliverability issues when their email headers, particularly the From field, contain non-standard encoding for the actual email address.
Friendly-from vs. address: There's a general understanding that only the display name (friendly-from) should be encoded, while the email address itself should not be. This practice helps avoid issues like Gmail blocking emails with Unicode characters in the From address.
ISP variations: Marketers note that some ISPs are more forgiving with non-compliant encoding than others, leading to inconsistent delivery experiences.
Debugging complexity: Troubleshooting encoding issues can be complex, especially when the RFCs don't immediately highlight the specific restriction on email address encoding within RFC 2047.
Key considerations
Testing is key: Regularly test email rendering and deliverability across different inbox providers to catch any encoding-related issues before they impact large campaigns. This is particularly important for fields like the subject line, where invalid characters can cause problems.
Understanding RFCs: Familiarize yourself with the relevant RFCs to avoid common pitfalls, even if they seem like minor technical details. A deeper dive into what RFC 5322 says versus what actually works can be very beneficial.
Deliverability impact: Incorrect encoding can negatively affect inbox placement, potentially increasing spam scores. This is a crucial aspect of overall email deliverability and compliance with email standards.
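Part of that testing can be automated with a round-trip check on the subject line. The following sketch assumes UTF-8 as the charset and uses Python's email.header module; encode_subject and decode_subject are hypothetical helper names, and the check complements rather than replaces testing against real inbox providers.

```python
from email.header import Header, decode_header, make_header


def encode_subject(text: str) -> str:
    """RFC 2047 encode a subject line as UTF-8, producing ASCII-only output."""
    return Header(text, charset="utf-8").encode()


def decode_subject(raw: str) -> str:
    """Decode any encoded-words in a raw Subject value back to text."""
    return str(make_header(decode_header(raw)))


subject = "Grüße aus München"
wire_value = encode_subject(subject)
print(wire_value)

# The wire form must be pure ASCII and must decode to the original text.
assert wire_value.isascii()
assert decode_subject(wire_value) == subject
```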
Marketer view
Email marketer from Email Geeks indicates that encoding the entire From header value, including the email address itself, with RFC 2047, is generally accepted by many email systems. However, this practice often leads to rejections from stricter providers like Gmail, which seem to demand the actual address remains in plain text.
08 Mar 2024 - Email Geeks
Marketer view
Email marketer from Free Support Forum, Aspose.com, reports that upgrading their email software led to encoding issues, causing some emails to score higher as spam. This highlights how software changes can inadvertently affect email encoding and deliverability, emphasizing the need for careful configuration.
15 Apr 2023 - Free Support Forum - aspose.com
What the experts say
Experts in email deliverability consistently emphasize the importance of strict adherence to RFCs for optimal performance and avoiding blocklists. They highlight that RFC 2047 is not a blanket solution for encoding all parts of an email header but is specifically for human-readable text. Encoding the actual email address is a non-compliant practice that can trigger spam filters, lead to deliverability failures, and even pose security risks.
Key opinions
RFC 2047 limitations: Experts confirm that RFC 2047 encoding is strictly for human-readable text parts of headers, such as subject lines and friendly names in address fields, not for the email address itself.
Validity vs. functionality: Just because some software accepts an invalid encoding (e.g., an encoded email address) does not make it RFC compliant. This can lead to intermittent delivery issues and to legitimate mail being wrongly flagged.
Security concern: Encoding the actual email address is widely seen as a security risk because it can facilitate phishing by obscuring malicious addresses that might look legitimate (e.g., homoglyph attacks).
Internationalization: While non-ASCII characters can exist in email addresses, they require the separate framework for internationalized email addresses (EAI), together with internationalized domain names (IDNs) for the domain part, not RFC 2047.
Silent issues: Some mailbox providers, like Gmail, might silently remove invalid encoding from headers when viewing original messages, making it difficult for senders to diagnose deliverability problems related to non-compliance.
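To make the homoglyph concern concrete, the sketch below lists every non-ASCII character found in an address along with its Unicode name. The function name non_ascii_report and both sample addresses are assumptions for illustration; real filtering systems use far more elaborate confusable detection.

```python
import unicodedata


def non_ascii_report(address: str) -> list[tuple[str, str]]:
    """List each non-ASCII character in an address with its Unicode name."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in address
        if not ch.isascii()
    ]


# "\u0430" is CYRILLIC SMALL LETTER A, visually close to the Latin letter "a".
lookalike = "billing@ex\u0430mple.com"
genuine = "billing@example.com"

print(lookalike == genuine)  # the strings differ despite looking alike
print(non_ascii_report(lookalike))
print(non_ascii_report(genuine))
```

If the address were wrapped in an encoded-word, this substitution would be even harder to spot in the raw header, which is one reason the address is expected to stay visible as plain text.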
Key considerations
RFC 2047 scope: Always remember that RFC 2047 applies only to specific header fields, or parts of fields, that carry human-readable text. It's not for core routing information like the email address.
Avoid pitfalls: Relying on the behavior of lenient receiving systems rather than strict RFC compliance can lead to unexpected deliverability issues down the line. It's better to conform to standards upfront.
Reference authoritative sources: When in doubt, consult the official RFCs for precise rules on encoding and message formatting, such as RFC 2047 itself, to ensure compliance and avoid unexpected delivery failures.
Expert view
Deliverability expert from Email Geeks states that RFC 2047 encoding is exclusively for human-readable text. This means it applies to fields like Subject lines and the friendly comments within From and To headers, but not to the core email address itself.
08 Mar 2024 - Email Geeks
Expert view
Email deliverability expert from Word to the Wise notes that encoding actual email addresses, while sometimes functional due to lenient software, is never considered a valid practice under the RFC standards. This non-compliance can lead to unexpected delivery failures.
10 Mar 2024 - Word to the Wise
What the documentation says
Official documentation, particularly the Request for Comments (RFCs) that define internet standards, provides the authoritative guidelines for email message formatting and encoding. RFC 2047, a key component of MIME, outlines specific rules for encoding non-ASCII text in email headers. These documents clearly delineate which parts of a header can be encoded and how, emphasizing that the actual email address is not subject to this type of encoding, reserving it for human-readable display elements.
Key findings
RFC 2047 scope: RFC 2047 describes techniques for encoding non-ASCII text within various portions of an RFC 822 (now RFC 5322) message header, specifically those intended to be displayed to users.
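Limited applicability: Section 5 of RFC 2047 lists the only places where an 'encoded-word' may appear: in place of 'text' in unstructured fields such as Subject and Comments, inside a 'comment', and in place of a 'word' within a 'phrase', such as the 'display-name' that precedes an address in From, To, or Cc. The same section forbids encoded-words in any portion of an 'addr-spec'.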
Email address format: RFC 5322, Section 3.4, specifies the format of address fields, clearly separating the 'display-name' (which can be encoded per RFC 2047) from the 'angle-addr', which wraps the 'addr-spec' (the actual email address, local-part@domain). The addr-spec must remain in ASCII unless the internationalized email extensions are in use.
Internationalized email addresses: RFC 4952 introduces a framework for fully supporting internationalized email addresses (EAI), since obsoleted by RFC 6530, indicating that non-ASCII characters in the actual email address require a different framework than RFC 2047 encoding.
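For the domain part, that separate framework relies on IDNA rather than encoded-words. As a rough sketch, Python's built-in "idna" codec can convert an internationalized domain to its ASCII form. Note the assumptions: the codec implements the older IDNA 2003 rules, the helper name domain_to_ascii is hypothetical, and a non-ASCII local-part is a different matter that needs SMTPUTF8 support (RFC 6531) on every hop.

```python
def domain_to_ascii(domain: str) -> str:
    """Convert an internationalized domain to its ASCII (A-label) form.

    Relies on Python's built-in "idna" codec (IDNA 2003 rules), so treat
    the result as illustrative rather than authoritative.
    """
    return domain.encode("idna").decode("ascii")


print(domain_to_ascii("bücher.example"))  # xn--bcher-kva.example
print(domain_to_ascii("example.com"))     # unchanged: already ASCII
```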
Key considerations
Precision in encoding: Documentation underscores the importance of precise application of encoding. Using RFC 2047 where it's not permitted, such as encoding the full email address, is a direct violation of the standard.
Adherence to RFCs: For maximum compatibility and deliverability, it's crucial to strictly follow the guidelines set forth in the relevant RFCs (e.g., RFC 2047 and RFC 5322) rather than relying on the permissive behavior of some systems.
Consequences of non-compliance: Non-compliance, even if seemingly minor, can result in emails being rejected or flagged as spam, affecting email deliverability rates significantly.
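The separation between display-name and addr-spec described in this section is mirrored by Python's email.headerregistry.Address class, which takes the two parts as separate fields. The sketch below is one possible way to build a message so that only the human-readable parts are encoded on serialization; the names and addresses are sample values.

```python
from email.headerregistry import Address
from email.message import EmailMessage
from email.policy import default

msg = EmailMessage(policy=default)
# Display name and address are supplied separately, so only the name
# is eligible for RFC 2047 encoding when the message is serialized.
msg["From"] = Address(display_name="José Müller", username="jose", domain="example.com")
msg["To"] = Address(username="reader", domain="example.org")
msg["Subject"] = "Grüße"
msg.set_content("Hello")

wire = msg.as_string()
print(wire)

sender = msg["From"].addresses[0]
print(sender.display_name, sender.addr_spec)
```

In the serialized output, the display name and subject appear as encoded-words while the address inside the angle brackets stays in plain ASCII.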
Technical article
IETF RFC 2047 describes techniques for encoding non-ASCII text in various portions of an RFC 822 message header, particularly where human-readable text is expected.
Nov 1996 - IETF Datatracker
Technical article
IETF RFC 4952 documentation introduces a series of specifications that define mechanisms and protocol extensions needed to fully support internationalized email addresses. This indicates that a separate framework exists for non-ASCII characters in the actual address.