Content sniffing

Content sniffing, also known as media type sniffing or MIME sniffing, is the practice of inspecting the content of a byte stream to attempt to deduce the file format of the data within it. It is generally used to compensate for a lack of accurate metadata that would otherwise be required for the file to be interpreted correctly. Content sniffing combines several techniques that rely on the redundancy found in most file formats: looking for file signatures and magic numbers, and applying heuristics such as searching for well-known representative substrings, byte frequency and n-gram tables, and Bayesian inference.
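
The signature-and-substring approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular browser or tool: the function name `sniff_media_type`, the short signature table, the HTML hint list, and the 512-byte inspection window are all hypothetical choices made for this example. The magic numbers themselves are the well-known leading bytes of the listed formats.

```python
# Illustrative signature table: (leading bytes, media type). Far from exhaustive.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]

# Representative substrings that suggest HTML (assumed list, for illustration).
HTML_HINTS = (b"<!doctype html", b"<html", b"<script")


def sniff_media_type(data: bytes) -> str:
    """Guess a media type from the first bytes of a stream (hypothetical helper)."""
    # Step 1: file signatures / magic numbers at the start of the stream.
    for magic, media_type in SIGNATURES:
        if data.startswith(magic):
            return media_type

    # Step 2: heuristic search for well-known substrings, ignoring
    # leading whitespace and letter case, within an assumed 512-byte window.
    head = data[:512].lstrip().lower()
    if any(head.startswith(hint) for hint in HTML_HINTS):
        return "text/html"

    # Step 3: crude text-versus-binary guess: a NUL byte suggests binary data.
    if b"\x00" in head:
        return "application/octet-stream"
    return "text/plain"
```

For instance, `sniff_media_type(b"%PDF-1.7 ...")` yields `"application/pdf"`, while a stream that begins with `<script` is reported as `"text/html"` even if it was served with a different declared type. That second case is the kind of reinterpretation that makes sniffing a security concern. Statistical methods such as byte frequency tables, n-grams, and Bayesian inference are not shown here.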

Property	Value
dbo:abstract	Content sniffing, also known as media type sniffing or MIME sniffing, is the practice of inspecting the content of a byte stream to attempt to deduce the file format of the data within it. It is generally used to compensate for a lack of accurate metadata that would otherwise be required for the file to be interpreted correctly. Content sniffing combines several techniques that rely on the redundancy found in most file formats: looking for file signatures and magic numbers, and applying heuristics such as searching for well-known representative substrings, byte frequency and n-gram tables, and Bayesian inference. MIME (Multipurpose Internet Mail Extensions) sniffing was, and still is, used by some web browsers, notably Microsoft's Internet Explorer, to help display content from web sites that do not correctly signal its MIME type. However, this opens up a serious security vulnerability: by confusing the MIME sniffing algorithm, an attacker can manipulate the browser into interpreting data in a way that permits operations expected by neither the site operator nor the user, such as cross-site scripting. Moreover, because sites that assign MIME types incorrectly appear to work in those browsers, sniffing fails to encourage the correct labeling of material, which in turn makes content sniffing necessary for these sites to work, creating a vicious circle of incompatibility with web standards and security best practices. A specification exists for media type sniffing in HTML5, which attempts to balance the requirements of security with the need for backward compatibility with web content that has missing or incorrect MIME-type data. It attempts to provide a precise specification that can be used across implementations to implement a single well-defined and deterministic set of behaviors. The UNIX file command can be viewed as a content sniffing application. (en)
dbo:wikiPageExternalLink	http://deletethis.net/dave/?q=mime-sniffing https://mimesniff.spec.whatwg.org/ http://tools.ietf.org/html/draft-abarth-mime-sniff-06 http://tools.ietf.org/html/draft-masinter-mime-web-info-00 https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Content-Type-Options
dbo:wikiPageID	36425928 (xsd:integer)
dbo:wikiPageLength	5479 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1124164573 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Bayesian_inference dbr:Mojibake dbr:N-gram dbr:Byte_stream dbr:Character_encoding dbr:UTF-7 dbr:Letter_frequency dbc:Computer_file_formats dbc:Web_technology dbr:Cross-site_scripting dbr:MIME_type dbc:Heuristics dbr:Web_browser dbr:ASCII dbr:File_(command) dbr:File_format dbr:Redundancy_(information_theory) dbr:HTML dbr:HTML5 dbr:Internet_Explorer dbr:Internet_Explorer_7 dbr:JScript dbc:Web_security_exploits dbr:Heuristic dbr:IETF dbr:Metadata dbr:Microsoft dbr:Browser_sniffing dbr:MIME dbr:Magic_number_(programming) dbr:Codepage dbr:Security_vulnerability
dbp:wikiPageUsesTemplate	dbt:Cite_web dbt:Mono dbt:Reflist dbt:See_also dbt:Short_description
dct:subject	dbc:Computer_file_formats dbc:Web_technology dbc:Heuristics dbc:Web_security_exploits
gold:hypernym	dbr:Practice
rdf:type	owl:Thing dbo:Company yago:WikicatComputerFileFormats yago:WikicatWebSecurityExploits yago:Abstraction100002137 yago:Accomplishment100035189 yago:Act100030358 yago:Action100037396 yago:Activity100407535 yago:Communication100033020 yago:Event100029378 yago:Feat100036762 yago:Format106636806 yago:Heuristic105847956 yago:Information106634376 yago:Message106598915 yago:Procedure101023820 yago:PsychologicalFeature100023100 yago:WikicatHeuristics yago:YagoPermanentlyLocatedEntity yago:Rule105846932
rdfs:comment	Content sniffing, also known as media type sniffing or MIME sniffing, is the practice of inspecting the content of a byte stream to attempt to deduce the file format of the data within it. It is generally used to compensate for a lack of accurate metadata that would otherwise be required for the file to be interpreted correctly. Content sniffing combines several techniques that rely on the redundancy found in most file formats: looking for file signatures and magic numbers, and applying heuristics such as searching for well-known representative substrings, byte frequency and n-gram tables, and Bayesian inference. (en)
rdfs:label	Content sniffing (en)
rdfs:seeAlso	dbr:Charset_detection
owl:sameAs	freebase:Content sniffing wikidata:Content sniffing https://global.dbpedia.org/id/4iP15 yago-res:Content sniffing
prov:wasDerivedFrom	wikipedia-en:Content_sniffing?oldid=1124164573&ns=0
foaf:isPrimaryTopicOf	wikipedia-en:Content_sniffing
is dbo:wikiPageRedirects of	dbr:Charset_sniffing dbr:Mime_sniffing dbr:MIME_sniffing
is dbo:wikiPageWikiLink of	dbr:Media_type dbr:WHATWG dbr:Charset_detection dbr:Browser_sniffing dbr:Charset_sniffing dbr:Mime_sniffing dbr:MIME_sniffing
is foaf:primaryTopic of	wikipedia-en:Content_sniffing