PE file format RVA




















We will not describe every field, only the ones that matter most here. The loader may or may not honor this value when it loads the binary. SectionAlignment: the alignment of the sections when they are loaded into memory. FileAlignment: the alignment used for the raw data of sections in the image file on disk. The value should be a power of 2 between 512 and 64 K. The default is 0x200. SizeOfImage: the size in bytes of the binary in memory, including all the headers. This value must be a multiple of SectionAlignment.
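The rounding implied by these alignment fields can be sketched in a few lines. This is a minimal illustration, not the loader's actual code: `align_up` and `size_of_image` are hypothetical helper names, and the section list is assumed to hold (virtual address, virtual size) pairs.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


def size_of_image(sections, section_alignment: int) -> int:
    """Estimate SizeOfImage from (virtual_address, virtual_size) pairs.

    Assumption: the image ends where the highest section ends, rounded up
    to SectionAlignment.
    """
    end = max(address + size for address, size in sections)
    return align_up(end, section_alignment)


if __name__ == "__main__":
    # Hypothetical layout: two sections, 4 KiB section alignment.
    print(hex(size_of_image([(0x1000, 0x2345), (0x4000, 0x10)], 0x1000)))
```

The same `align_up` helper applies to FileAlignment when computing the on-disk size of a section's raw data.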

Why so? All the bytes of data that come before the first section can be called "meta section data", since they are not loaded into the VA space of the process. Then when you try to read 0x in process A, you get the content that is located at 0x of physical memory. As for RVA, it is simply designed to ease relocation. When loading relocatable modules (e.g., a DLL), the system will try to slide the module through the process memory space. So the file layout stores a "relative" address to help with the calculation.

Usually the RVA in image files is relative to the process base address when the image is loaded into memory, but some RVAs may be relative to the "section" starting address in image or object files (you have to check the PE format spec for details). They are usually relative to some VA, which may be a default loading base address or a section base VA; that's why I say you must check the PE format spec for details.
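For the common case, where an RVA is relative to the image base, the conversion is plain addition and subtraction. The sketch below uses hypothetical helper names and made-up addresses. It only illustrates why an RVA stays valid when the module is loaded somewhere other than its preferred base.

```python
def rva_to_va(rva: int, image_base: int) -> int:
    """Virtual address of an item, given the base the image was loaded at."""
    return image_base + rva


def va_to_rva(va: int, image_base: int) -> int:
    """Inverse conversion; the VA must lie at or above the image base."""
    if va < image_base:
        raise ValueError("address is below the image base")
    return va - image_base


if __name__ == "__main__":
    rva = 0x1000                      # stored in the file, never changes
    preferred_base = 0x10000000       # hypothetical preferred load address
    actual_base = 0x6FA00000          # hypothetical address after relocation
    print(hex(rva_to_va(rva, preferred_base)))
    print(hex(rva_to_va(rva, actual_base)))
```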

Since the sections start at different bases, RVAs may differ when crossing sections. Usually we don't discuss the "RVA" before the sections, but the PE header is still loaded, up to the end of the section headers. The gap between the section headers and the section bodies (if any) is not loaded. You can check that with a debugger. Moreover, when there is a gap between sections, it may not be loaded. When you read the PE format spec you may find some "RVA" that is relative to a special address, such as the resource starting address.

Why 0x? What you've missed is the concept of a "section" in the PE loading stage. A PE may contain several "sections", and each section maps to a new starting VA. For example, this is dumped from the win7 kernel. The sections should be contiguous when loaded into memory.

To support memory mapping, these sections must still be aligned to some size (the file alignment size, decided by the linker). Offset means "offset from the beginning of the physical PE file". So the headers occupy 0x bytes of the file (and 0x when mapped to memory), which is the offset of section 1.
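Because each section has its own file offset and its own starting RVA, translating an RVA back to a file offset means finding the section that contains it. The following is a minimal sketch with a hypothetical `Section` record and made-up numbers. It assumes the headers are mapped at the same offsets in memory as on disk, and that bytes beyond a section's raw data have no file offset at all.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA of the section once loaded
    virtual_size: int     # size in memory
    raw_pointer: int      # offset of the raw data in the file
    raw_size: int         # size of the raw data in the file


def rva_to_offset(rva: int, sections: Sequence[Section],
                  size_of_headers: int) -> Optional[int]:
    """Return the file offset for an RVA, or None if nothing on disk backs it."""
    if rva < size_of_headers:
        return rva  # assumption: headers are mapped one to one
    for s in sections:
        span = max(s.virtual_size, s.raw_size)
        if s.virtual_address <= rva < s.virtual_address + span:
            delta = rva - s.virtual_address
            # Memory past the raw data is zero-filled, not read from the file.
            return s.raw_pointer + delta if delta < s.raw_size else None
    return None  # the RVA falls in a gap that is not loaded


if __name__ == "__main__":
    text = Section(".text", 0x1000, 0x500, 0x400, 0x600)
    print(rva_to_offset(0x1010, [text], 0x400))
```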

Then by aligning its data (c44c1 bytes), we get physsize C

As mentioned earlier, there are two versions of the Optional Header, one for 32-bit executables and one for 64-bit executables. The two versions differ in two aspects.

Magic: Microsoft documentation describes this field as an integer that identifies the state of the image, and it mentions three common values.
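Reading the Magic field is enough to tell the two Optional Header versions apart. The sketch below assumes the three values listed in the published PE specification (0x10B for PE32, 0x20B for PE32+, 0x107 for a ROM image); `optional_header_kind` is a hypothetical helper name.

```python
import struct

# Values taken from the published PE specification.
MAGIC_NAMES = {
    0x10B: "PE32",       # 32-bit executable
    0x20B: "PE32+",      # 64-bit executable
    0x107: "ROM image",
}


def optional_header_kind(optional_header: bytes) -> str:
    """Classify an Optional Header by its first two bytes (little-endian)."""
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    return MAGIC_NAMES.get(magic, f"unknown (0x{magic:X})")


if __name__ == "__main__":
    print(optional_header_kind(b"\x0b\x02"))
```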

Machine is ignored by the Windows PE loader. SizeOfCode: this field holds the size of the code. SizeOfInitializedData: this field holds the size of the initialized data.

SizeOfUninitializedData: this field holds the size of the uninitialized data.

The program can provide one or more TLS callback functions to support additional initialization and termination for TLS data objects. A typical use for such a callback function would be to call constructors and destructors for objects.

Although there is typically no more than one callback function, a callback is implemented as an array to make it possible to add additional callback functions if desired.

If there is more than one callback function, each function is called in the order in which its address appears in the array. A null pointer terminates the array. It is perfectly valid to have an empty list (no callback supported), in which case the callback array has exactly one member, a null pointer.
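Walking such a null-terminated array is straightforward. This is a rough sketch, assuming the array bytes have already been located and extracted from the image. `read_tls_callbacks` and its `limit` safety cap are hypothetical, and the addresses in the example are made up.

```python
import struct


def read_tls_callbacks(data: bytes, offset: int = 0,
                       pointer_size: int = 4, limit: int = 1024):
    """Collect callback addresses until the terminating null pointer.

    pointer_size is 4 for 32-bit images and 8 for 64-bit images.
    Raises struct.error if the data ends before a null pointer is found.
    """
    if pointer_size not in (4, 8):
        raise ValueError("pointer_size must be 4 or 8")
    fmt = "<I" if pointer_size == 4 else "<Q"
    callbacks = []
    for _ in range(limit):
        (address,) = struct.unpack_from(fmt, data, offset)
        if address == 0:  # a null pointer terminates the array
            break
        callbacks.append(address)
        offset += pointer_size
    return callbacks


if __name__ == "__main__":
    raw = struct.pack("<III", 0x401000, 0x401020, 0)
    print([hex(a) for a in read_tls_callbacks(raw)])
```

The returned list preserves array order, which is also the order in which the callbacks are called.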

The Reserved parameter should be set to zero. The Reason parameter can take the following values.

Current versions of the Microsoft linker and Windows XP and later versions of Windows use a new version of this structure for 32-bit x86-based systems that include reserved SEH technology. This provides a list of safe structured exception handlers that the operating system uses during exception dispatching.

Otherwise, the operating system terminates the application. This helps prevent the "x86 exception handler hijacking" exploit that has been used in the past to take control of the operating system. The Microsoft linker automatically provides a default load configuration structure to include the reserved SEH data. If the user code already provides a load configuration structure, it must include the new reserved SEH fields. The data directory entry for a pre-reserved SEH load configuration structure must specify a particular size of the load configuration structure because the operating system loader always expects it to be a certain value.

In that regard, the size is really only a version check. For compatibility with Windows XP and earlier versions of Windows, the size must be 64 for x86 images. Delayload import table in its own.

Module contains suppressed export information. This also infers that the address taken IAT table is also present in the load config. Mask for the subfield that contains the stride of Control Flow Guard function table entries that is, the additional count of bytes per table entry.

Additionally, the Windows SDK winnt.

Resources are indexed by a multiple-level binary-sorted tree structure. By convention, however, Windows uses three levels. A series of resource directory tables relates all of the levels in the following way: each directory table is followed by a series of directory entries that give the name or identifier (ID) for that level (Type, Name, or Language level) and an address of either a data description or another directory table. If the address points to a data description, then the data is a leaf in the tree.

If the address points to another directory table, then that table lists directory entries at the next level down. A leaf's Type, Name, and Language IDs are determined by the path that is taken through directory tables to reach the leaf.

The first table determines Type ID, the second table (pointed to by the directory entry in the first table) determines Name ID, and the third table determines Language ID. Each resource directory table has the following format. This data structure should be considered the heading of a table because the table actually consists of directory entries described in section 6.

The directory entries make up the rows of a table. Each resource directory entry has the following format. Whether the entry is a Name or ID entry is indicated by the resource directory table, which indicates how many Name and ID entries follow it (remember that all the Name entries precede all the ID entries for the table). All entries for the table are sorted in ascending order: the Name entries by case-sensitive string and the ID entries by numeric value.
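One level of this tree can be read with a short parser. The sketch below assumes the layout given in the published PE specification, which is not reproduced in this text: a 16-byte table header ending in the Name and ID entry counts, followed by 8-byte entries whose second field has its high bit set when it points to another directory table. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_resource_directory(rsrc: bytes, offset: int = 0):
    """Parse one resource directory table and the entries that follow it.

    Offsets in the result are relative to the start of the resource data.
    """
    (_characteristics, _timestamp, _major, _minor,
     n_names, n_ids) = struct.unpack_from("<IIHHHH", rsrc, offset)
    entries = []
    pos = offset + 16
    for index in range(n_names + n_ids):
        name_or_id, target = struct.unpack_from("<II", rsrc, pos)
        is_name = index < n_names  # all Name entries precede all ID entries
        entries.append({
            "is_name": is_name,
            "name_offset_or_id": name_or_id & 0x7FFFFFFF if is_name else name_or_id,
            "is_subdirectory": bool(target & 0x80000000),
            "offset": target & 0x7FFFFFFF,
        })
        pos += 8
    return entries


if __name__ == "__main__":
    table = struct.pack("<IIHHHH", 0, 0, 0, 0, 0, 1) + struct.pack("<II", 3, 0x80000020)
    print(parse_resource_directory(table))
```

Calling the function again at the `offset` of an entry marked `is_subdirectory` descends one level, from Type to Name to Language.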

The resource directory string area consists of Unicode strings, which are word-aligned. These strings are stored together after the last Resource Directory entry and before the first Resource Data entry. This minimizes the impact of these variable-length strings on the alignment of the fixed-size directory entries.

Each resource directory string has the following format. Each Resource Data entry describes an actual unit of raw data in the Resource Data area. A Resource Data entry has the following format.

CLR metadata is stored in this section. It is used to indicate that the object file contains managed code. The format of the metadata is not documented, but can be handed to the CLR interfaces for handling metadata.

The valid exception handlers of an object are listed in the .sxdata section of that object. It contains the COFF symbol index of each valid handler, using 4 bytes per index.

The COFF archive format provides a standard mechanism for storing collections of object files. These collections are commonly called libraries in programming documentation. The first 8 bytes of an archive consist of the file signature.

The rest of the archive consists of a series of archive members, as follows. The first and second members are "linker members." Typically, a linker places information into these archive members. The linker members contain the directory of the archive. The third member is the "longnames" member. This optional member consists of a series of null-terminated ASCII strings in which each string is the name of another archive member. The rest of the archive consists of standard object-file members.

Each of these members contains the contents of one object file in its entirety. An archive member header precedes each member. The archive file signature identifies the file type. Any utility (for example, a linker) that takes an archive file as input can check the file type by reading this signature.

Each member (linker, longnames, or object-file member) is preceded by a header. An archive member header has the following format, in which each field is an ASCII text string that is left justified and padded with spaces to the end of the field.

There is no terminating null character in any of these fields. Each member header starts on the first even address after the end of the previous archive member. The Name field has one of several formats. As mentioned earlier, each of these strings is left justified and padded with trailing spaces within a field of 16 bytes.
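The archive layout described above can be walked with a small reader. This is a sketch under stated assumptions: the signature bytes `!<arch>\n` and the field widths (16, 12, 6, 6, 8, 10, and 2 bytes, 60 in total) come from the published archive format rather than from this text, and the function names are hypothetical.

```python
ARCHIVE_SIGNATURE = b"!<arch>\n"  # the first 8 bytes of an archive
HEADER_SIZE = 60


def parse_member_header(header: bytes) -> dict:
    """Split a 60-byte member header into its space-padded ASCII fields."""
    if len(header) != HEADER_SIZE or header[58:60] != b"`\n":
        raise ValueError("not a valid archive member header")

    def text(lo: int, hi: int) -> str:
        return header[lo:hi].decode("ascii").rstrip(" ")

    return {
        "name": text(0, 16),
        "date": text(16, 28),
        "user_id": text(28, 34),
        "group_id": text(34, 40),
        "mode": text(40, 48),
        "size": int(text(48, 58)),  # decimal size of the member body
    }


def iter_members(archive: bytes):
    """Yield (header, body) for each member of an in-memory archive."""
    if not archive.startswith(ARCHIVE_SIGNATURE):
        raise ValueError("missing archive signature")
    pos = len(ARCHIVE_SIGNATURE)
    while pos + HEADER_SIZE <= len(archive):
        header = parse_member_header(archive[pos:pos + HEADER_SIZE])
        start = pos + HEADER_SIZE
        yield header, archive[start:start + header["size"]]
        pos = start + header["size"]
        pos += pos & 1  # each header starts on an even address
```

For a real library, `iter_members(open(path, "rb").read())` would list the linker members first, then the optional longnames member, then the object-file members.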

The first linker member is included for backward compatibility. It is not used by current linkers, but its format must be correct. This linker member provides a directory of symbol names, as does the second linker member. For each symbol, the information indicates where to find the archive member that contains the symbol. The elements in the offsets array must be arranged in ascending order.

This fact implies that the symbols in the string table must be arranged according to the order of archive members. For example, all the symbols in the first object-file member would have to be listed before the symbols in the second object file.

Although both linker members provide a directory of symbols and archive members that contain them, the second linker member is used in preference to the first by all current linkers. The second linker member includes symbol names in lexical order, which enables faster searching by name. The longnames member is a series of strings of archive member names.
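Resolving a member name against the longnames member can be sketched as follows. The Name field conventions used here (`name/` for a short name, `/` and `//` for the linker and longnames members, `/n` for a decimal offset into the longnames member) are taken from the published archive format and are an assumption as far as this text goes; `resolve_member_name` is a hypothetical helper.

```python
def resolve_member_name(name_field: str, longnames: bytes = b"") -> str:
    """Return the real member name for a 16-byte Name field.

    longnames is the raw body of the longnames member, if present.
    """
    name = name_field.rstrip(" ")
    if name in ("/", "//"):
        return name  # linker member or the longnames member itself
    if name.startswith("/") and name[1:].isdigit():
        start = int(name[1:])                  # offset into longnames
        end = longnames.index(b"\0", start)    # strings are null-terminated
        return longnames[start:end].decode("ascii")
    if name.endswith("/"):
        return name[:-1]                       # short name stored inline
    return name


if __name__ == "__main__":
    table = b"first_long_name.obj\0second_long_name.obj\0"
    print(resolve_member_name("/20", table))
```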

A name appears here only when there is insufficient room in the Name field (16 bytes). The longnames member is optional. It can be empty with only a header, or it can be completely absent without even a header. The strings are null-terminated. Each string begins immediately after the null byte in the previous string.

Traditional import libraries, that is, libraries that describe the exports from one image for use by another, typically follow the layout described in section 7, Archive Library File Format.

The primary difference is that import library members contain pseudo-object files instead of real ones, in which each member includes the section contributions that are required to build the import tables that are described in section 6. The section contributions for an import can be inferred from a small set of information. The linker can either generate the complete, verbose information into the import library for each member at the time of the library's creation or write only the canonical information to the library and let the application that later uses it generate the necessary data on the fly.

This is sufficient information to accurately reconstruct the entire contents of the member at the time of its use. This structure is followed by two null-terminated strings that describe the imported symbol's name and the DLL from which it came.

These values are used to determine which section contributions must be generated by the tool that uses the library if it must access that data. The null-terminated import symbol name immediately follows its associated import header.

The following values are defined for the Name Type field in the import header. They indicate how the name is to be used to generate the correct symbols that represent the import.

Several attribute certificates are expected to be used to verify the integrity of the images. However, the most common is Authenticode signature. To accomplish this task, Authenticode signatures contain something called a PE image hash.

The Authenticode PE image hash, or file hash for short, is similar to a file checksum in that it produces a small value that relates to the integrity of a file. A checksum is produced by a simple algorithm and is used primarily to detect memory failures.

That is, it is used to detect whether a block of memory on disk has gone bad and the values stored there have become corrupted. However, unlike most checksum algorithms, it is very difficult to modify a file so that it has the same file hash as its original unmodified form.

That is, a checksum is intended to detect simple memory failures that lead to corruption, but a file hash can be used to detect intentional and even subtle modifications to a file, such as those introduced by viruses, hackers, or Trojan horse programs.

In an Authenticode signature, the file hash is digitally signed by using a private key known only to the signer of the file. A software consumer can verify the integrity of the file by calculating the hash value of the file and comparing it to the value of signed hash contained in the Authenticode digital signature.

If the file hashes do not match, part of the file covered by the PE image hash has been modified. It is not possible or desirable to include all image file data in the calculation of the PE image hash.

Sometimes it simply presents undesirable characteristics (for example, debugging information cannot be removed from publicly released files); sometimes it is simply impossible. For example, it is not possible to include all information within an image file in an Authenticode signature, then insert the Authenticode signature that contains that PE image hash into the PE image, and later be able to generate an identical PE image hash by including all image file data in the calculation again, because the file now contains the Authenticode signature that was not originally there.

This section describes how a PE image hash is calculated and what parts of the PE image can be modified without invalidating the Authenticode signature.

The PE image hash for a specific file can be included in a separate catalog file without including an attribute certificate within the hashed file. This is relevant, because it becomes possible to invalidate the PE image hash in an Authenticode-signed catalog file by modifying a PE image that does not actually contain an Authenticode signature.

All data in sections of the PE image that are specified in the section table are hashed in their entirety except for the following exclusion ranges.

The file CheckSum field of the Windows-specific fields of the optional header. This checksum includes the entire file including any attribute certificates in the file.

In all likelihood, the checksum will be different than the original value after inserting the Authenticode signature. Information related to attribute certificates. The areas of the PE image that are related to the Authenticode signature are not included in the calculation of the PE image hash because Authenticode signatures can be added to or removed from an image without affecting the overall integrity of the image. This is not a problem, because there are user scenarios that depend on re-signing PE images or adding a time stamp.

Authenticode excludes the following information from the hash calculation. The Certificate Table and corresponding certificates that are pointed to by the Certificate Table field listed immediately above. To calculate the PE image hash, Authenticode orders the sections that are specified in the section table by address range, then hashes the resulting sequence of bytes, passing over the exclusion ranges.
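The idea of hashing a byte sequence while passing over exclusion ranges can be illustrated in isolation. The sketch below is not the Authenticode algorithm itself: it does not parse headers, order sections, or stop at the end of the last section. It only shows the skipping step, with `hash_excluding` as a hypothetical helper and the exclusion ranges given as (start, length) pairs supplied by the caller.

```python
import hashlib


def hash_excluding(data: bytes, exclusions, algorithm: str = "sha256") -> str:
    """Hash data while skipping each (start, length) exclusion range."""
    digest = hashlib.new(algorithm)
    pos = 0
    for start, length in sorted(exclusions):
        if start < pos:
            raise ValueError("exclusion ranges overlap")
        digest.update(data[pos:start])  # bytes before the excluded range
        pos = start + length            # resume right after it
    digest.update(data[pos:])
    return digest.hexdigest()


if __name__ == "__main__":
    original = b"AAAA\x00\x00\x00\x00BBBB"
    patched = b"AAAA\x12\x34\x56\x78BBBB"   # only the excluded field differs
    print(hash_excluding(original, [(4, 4)]) == hash_excluding(patched, [(4, 4)]))
```

Changing a byte inside an excluded range, such as a checksum field, leaves the result unchanged, while changing any other byte does not.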

Information past the end of the last section. The area past the last section (defined by highest offset) is not hashed. This area commonly contains debug information. Debug information can generally be considered advisory to debuggers; it does not affect the actual integrity of the executable program.

It is quite literally possible to remove debug information from an image after a product has been delivered and not affect the functionality of the program. In fact, this is sometimes done as a disk-saving measure.

It is worth noting that debug information contained within the specified sections of the PE image cannot be removed without invalidating the Authenticode signature. You can use the makecert and signtool tools provided in the Windows Platform SDK to experiment with creating and verifying Authenticode signatures. For more information, see Reference, below.

Creating, Viewing, and Managing Certificates.

Kernel-Mode Code Signing Walkthrough. ImageHlp Functions.

Note: This document is provided to aid in the development of tools and applications for Windows but is not guaranteed to be a complete specification in all respects.

Note: Statically declared TLS data objects can be used only in statically loaded image files.

Attribute certificate: a certificate that is used to associate verifiable statements with an image. A number of different verifiable statements can be associated with a file; one of the most useful ones is a statement by a software manufacturer that indicates what the message digest of the image is expected to be.

A message digest is similar to a checksum except that it is extremely difficult to forge. Therefore, it is very difficult to modify a file to have the same message digest as the original file. The statement can be verified as being made by the manufacturer by using public or private key cryptography schemes. This document does not describe details about attribute certificates other than to allow for their insertion into image files.

In most cases, the format of each stamp is the same as that used by the time functions in the C run-time library.

File pointer: the location of an item within the file itself, before being processed by the linker (in the case of object files) or the loader (in the case of image files). In other words, this is a position within the file as stored on disk.

Object file: a file that is given as input to the linker. The linker produces an image file, which in turn is used as input by the loader. The term "object file" does not necessarily imply any connection to object-oriented programming.

A description of a field that indicates that the value of the field must be zero for generators and consumers must ignore the field.

RVA: in an image file, this is the address of an item after it is loaded into memory, with the base address of the image file subtracted from it.

The RVA of an item almost always differs from its position within the file on disk (file pointer). In an object file, an RVA is less meaningful because memory locations are not assigned. In this case, an RVA would be an address within a section (described later in this table), to which a relocation is later applied during linking. For simplicity, a compiler should just set the first RVA in each section to zero.

For example, all code in an object file can be combined within a single section or (depending on compiler behavior) each function can occupy its own section.

With more sections, there is more file overhead, but the linker is able to link in code more selectively. A section is similar to a segment in Intel architecture. All the raw data in a section must be loaded contiguously. In addition, an image file can contain a number of sections, such as.

VA: same as RVA, except that the base address of the image file is not subtracted. The address is called a VA because Windows creates a distinct VA space for each process, independent of physical memory. For almost all purposes, a VA should be considered just an address. A VA is not as predictable as an RVA because the loader might not load the image at its preferred location.

Machine: the number that identifies the type of target machine. For more information, see Machine Types. NumberOfSections: the number of sections.

This indicates the size of the section table, which immediately follows the headers. This value should be zero for an image because COFF debugging information is deprecated. NumberOfSymbols: the number of entries in the symbol table. This data can be used to locate the string table, which immediately follows the symbol table. SizeOfOptionalHeader: the size of the optional header, which is required for executable files but not for object files. This value should be zero for an object file. For a description of the header format, see Optional Header (Image Only).

Characteristics: the flags that indicate the attributes of the file. For specific flag values, see Characteristics. This indicates that the file does not contain base relocations and must therefore be loaded at its preferred base address. If the base address is not available, the loader reports an error. The default behavior of the linker is to strip base relocations from executable (EXE) files.

Image only. This indicates that the image file is valid and can be run. If this flag is not set, it indicates a linker error. COFF symbol table entries for local symbols have been removed. This flag is deprecated and should be zero. Aggressively trim working set. This flag is deprecated for Windows 2000 and later and must be zero. The image file is a dynamic-link library (DLL). Such files are considered executable files for almost all purposes, although they cannot be directly run.
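The file header fields and flags described above fit in 20 bytes and can be decoded with a single `struct` call. This is a minimal sketch: the field order, the 20-byte size, and the flag constants are taken from the published PE specification rather than from this text, only the flags discussed here are listed, and the class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class CoffHeader(NamedTuple):
    machine: int
    number_of_sections: int
    time_date_stamp: int
    pointer_to_symbol_table: int
    number_of_symbols: int
    size_of_optional_header: int
    characteristics: int


# Subset of the Characteristics flags, limited to those described in the text.
CHARACTERISTICS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0010: "IMAGE_FILE_AGGRESSIVE_WS_TRIM",
    0x2000: "IMAGE_FILE_DLL",
}


def parse_coff_header(data: bytes, offset: int = 0) -> CoffHeader:
    """Decode the 20-byte COFF file header found at offset."""
    return CoffHeader(*struct.unpack_from("<HHIIIHH", data, offset))


def decode_characteristics(value: int):
    """Return the names of the known flags that are set in value."""
    return [name for bit, name in sorted(CHARACTERISTICS.items()) if value & bit]
```

In an image file the header sits right after the 4-byte PE signature, so `offset` would be the signature offset plus 4.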

Magic: the unsigned integer that identifies the state of the image file. The most common number is 0x10B, which identifies it as a normal executable file. SizeOfCode: the size of the code (text) section, or the sum of all code sections if there are multiple sections. SizeOfInitializedData: the size of the initialized data section, or the sum of all such sections if there are multiple data sections. AddressOfEntryPoint: the address of the entry point relative to the image base when the executable file is loaded into memory.

For program images, this is the starting address. For device drivers, this is the address of the initialization function. An entry point is optional for DLLs. When no entry point is present, this field must be zero.

BaseOfCode: the address that is relative to the image base of the beginning-of-code section when it is loaded into memory. BaseOfData: the address that is relative to the image base of the beginning-of-data section when it is loaded into memory. ImageBase: the preferred address of the first byte of image when loaded into memory; must be a multiple of 64 K.

The default for DLLs is 0x10000000. SectionAlignment: the alignment (in bytes) of sections when they are loaded into memory. It must be greater than or equal to FileAlignment. The default is the page size for the architecture. FileAlignment: the alignment factor (in bytes) that is used to align the raw data of sections in the image file. The value should be a power of 2 between 512 and 64 K, inclusive. The default is 512. SizeOfImage: the size (in bytes) of the image, including all headers, as the image is loaded in memory.

It must be a multiple of SectionAlignment. CheckSum: the image file checksum. The following are checked for validation at load time: all drivers, any DLL loaded at boot time, and any DLL that is loaded into a critical Windows process.
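The constraints on these fields lend themselves to a simple sanity check. The sketch below is an illustration, not the loader's validation logic. `check_optional_header` is a hypothetical helper, and the lower bound of 512 for FileAlignment is taken from the published PE specification.

```python
def check_optional_header(image_base: int, section_alignment: int,
                          file_alignment: int, size_of_image: int):
    """Return a list of violated constraints (empty if all hold)."""
    problems = []
    if image_base % 0x10000:
        problems.append("ImageBase is not a multiple of 64 K")
    if section_alignment < file_alignment:
        problems.append("SectionAlignment is smaller than FileAlignment")
    is_power_of_two = file_alignment > 0 and (file_alignment & (file_alignment - 1)) == 0
    if not (is_power_of_two and 512 <= file_alignment <= 0x10000):
        problems.append("FileAlignment is not a power of 2 between 512 and 64 K")
    if section_alignment <= 0 or size_of_image % section_alignment:
        problems.append("SizeOfImage is not a multiple of SectionAlignment")
    return problems


if __name__ == "__main__":
    # Hypothetical values for a small 32-bit image.
    print(check_optional_header(0x400000, 0x1000, 0x200, 0x5000))
```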


