To find out why our mascot for this release has a pitchfork and more on nerdy naming, read below the fold. For the summary of Braize’s (3.4) major new features (including one surprise feature that appeared mid-roadmap), here are the highlights:
You’ll notice the theme of this release has been major improvements in decompilation. We’re really excited about the quality of the improvements in the first three major features described above, and they’re joined by several other important improvements as well.
So why is the mascot for this release Binjy with a pitchfork? And what is Braize anyway? In case you missed our last announcement, we’ve begun code-naming releases after sci-fi/fantasy planets. This time, Braize is one of the lesser-known planets in Brandon Sanderson’s Cosmere. Despite its reputation as a prison/hellscape planet in that universe, as huge nerds, we couldn’t resist including it early in our naming scheme. In this case, super-powered heroes were released from torment there, just like 3.4 has been released into the world with some super powers of its own! It’s also fitting that “C” is Coruscant, with tomorrow being May the 4th! Feel free to send us your suggestions for “E” and beyond.
While major releases get all the press, even our point releases include a huge number of changes. The first three features in this release can produce significantly better decompilation results across a variety of binaries and platforms.
It will take more than one new feature to solve C++ reverse engineering, but one of the biggest obstacles has been resolved with this latest feature: the core type system now supports inherited types. So what does that mean? For starters, the UIs and APIs around creating structures have changed. Now when you use `s` to create a structure in linear view or the type sidebar, you’ll see many new options:
The updated type system now lets you assign Base Structures, whose members will be inherited by your structure. You can avoid having to create duplicate members for every class in a hierarchy, and cross references will be propagated up the inheritance chain. Base Structures can also be located at an offset, supporting structures with multiple inherited bases.
For virtual function table-like classes, there is now the ability to propagate data variable references to the structure members. Looking at the cross references to a structure member will then include any annotated data variables using that structure. Simply put, if you mark up virtual tables, you can follow their functions from where they are used to what values they could have.
What really matters is what this allows you to do. Consider the following simple example:
struct A __packed
{
    struct vtable_for_A* vtable;
    int32_t x;
};

struct __base(A, 0) B
{
    struct vtable_for_B* vtable;
    __inherited int32_t A::x;
    int32_t y;
};

struct __base(B, 0) C
{
    struct vtable_for_C* vtable;
    __inherited int32_t A::x;
    __inherited int32_t B::y;
    int32_t z;
};
You can create these types in the above UI, or you can use the notation shown there, as we’ve extended the clang type parser to include the `__inherited` keyword.
Notice that the virtual tables support inheritance as well!
struct __data_var_refs __ptr_offset(0x10) vtable_for_A __packed
{
    void* field_0;
    void* typeinfo;
    int32_t (* sum)(struct A* this);
};

struct __base(vtable_for_A, 0) __data_var_refs __ptr_offset(0x10) vtable_for_B
{
    __inherited void* vtable_for_A::field_0;
    __inherited void* vtable_for_A::typeinfo;
    __inherited int32_t (* vtable_for_A::sum)(struct A* this);
};

struct __base(vtable_for_B, 0) __data_var_refs __ptr_offset(0x10) vtable_for_C
{
    __inherited void* vtable_for_A::field_0;
    __inherited void* vtable_for_A::typeinfo;
    void (* sum)(struct A* this);
};
The beauty of this is that, when the types are applied, you get cross-references across the entire binary for not only the base class, but all the children as well. Check out the cross references for A’s virtual table versus B’s virtual table in the following sample file:
All this combines to make a very powerful type system for object oriented languages like C++. For more details and an even more complicated example showing multiple inheritance, check out the C++ Types help document.
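To make the Base Structure idea a little more concrete, here is a small plain-Python model of how members declared on a base at some offset become visible on every derived structure. It is only a sketch and does not use the Binary Ninja API: `Member`, `Struct`, and `flatten` are hypothetical names, and the offsets assume 8-byte pointers with the packed layout from the A/B/C example.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    offset: int      # offset relative to the struct that declares it
    type_name: str
    name: str


@dataclass
class Struct:
    name: str
    members: list                               # members declared directly here
    bases: list = field(default_factory=list)   # (base Struct, base offset) pairs


def flatten(struct, base_offset=0):
    """Return (absolute offset, declaring struct name, member) for every member,
    own or inherited. A struct's own member replaces an inherited one at the
    same offset (as the vtable pointer does in the example)."""
    by_offset = {}
    for base, offset in struct.bases:
        for abs_off, owner, member in flatten(base, base_offset + offset):
            by_offset[abs_off] = (abs_off, owner, member)
    for member in struct.members:
        abs_off = base_offset + member.offset
        by_offset[abs_off] = (abs_off, struct.name, member)
    return [by_offset[key] for key in sorted(by_offset)]


# Hypothetical layout mirroring the A/B/C example (assumes 8-byte pointers).
A = Struct("A", [Member(0, "struct vtable_for_A*", "vtable"), Member(8, "int32_t", "x")])
B = Struct("B", [Member(0, "struct vtable_for_B*", "vtable"), Member(12, "int32_t", "y")], [(A, 0)])
C = Struct("C", [Member(0, "struct vtable_for_C*", "vtable"), Member(16, "int32_t", "z")], [(B, 0)])

layout_c = flatten(C)
for off, owner, m in layout_c:
    print(f"{off:#04x} {owner}::{m.name}: {m.type_name}")
```

Because `x` is only declared once, on `A`, any reference recorded against `A::x` naturally applies to `B` and `C` too, which is the intuition behind cross references propagating along the inheritance chain.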
One thing to keep in mind is that compilers don’t always follow the layout you expect! While debug info formats like PDBs contain the true data, your types may not always copy/paste directly from source code, so creating and applying the appropriate types may require some manual effort (for now!).
Yes, we’ve talked about it before, but not only is the feature now enabled by default, it has had some important upgrades! To recap, this feature identifies instances where a function call like `strcpy` might have been inlined and turns it back into an outlined call. It also identifies situations where memory-setting operations might have been unrolled (think initializing a stack array or structure to zero) and replaces them with a more convenient `memset` operation.
You can tell the feature is enabled by looking in your analyzed binary view for a `.synthetic_builtins` section:
This fake section is created and appended to your existing memory layout as a place to store function prototypes for these built-in functions. We then rewrite the analysis to point to those built-in functions wherever they were inlined. This makes it easy to select one of the function prototypes and use cross references to find every place in the binary where it was applied:
In this new release, outlining can identify stack and global memory being used for similar patterns. It avoids some false positives by following type alignment constraints, adds support for UTF-16 wide strings, and will also outline `memcpy` and `memset` for some well-known native architecture instructions such as `rep stos*` and `rep movs*`.
This new feature can dramatically improve decompilation. As you create variables, the analysis will create `memset` calls based on your types, automatically converting sequences of assignments to more concise outlined calls. It may happen in some unexpected cases, but properly defining your types should improve results.
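As a rough illustration of the idea (not the analysis Binary Ninja actually performs), the sketch below collapses a run of adjacent stores of the same byte value into a single `memset`-style call. `Store`, `outline_memset`, and the `min_bytes` threshold are all hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Store(NamedTuple):
    addr: int   # destination offset of the store
    size: int   # number of bytes written
    byte: int   # repeated byte value being written (0 when zeroing)


def outline_memset(stores, min_bytes=8):
    """Collapse runs of adjacent same-byte stores into memset-like calls.

    Returns a list whose items are either an untouched Store or a
    ("memset", dest, byte, length) tuple.
    """
    out = []
    i = 0
    while i < len(stores):
        first = stores[i]
        end = first.addr + first.size
        j = i + 1
        # Extend the run while the next store begins exactly where the
        # previous one ended and writes the same byte value.
        while j < len(stores) and stores[j].addr == end and stores[j].byte == first.byte:
            end += stores[j].size
            j += 1
        length = end - first.addr
        if j - i > 1 and length >= min_bytes:
            out.append(("memset", first.addr, first.byte, length))
        else:
            out.extend(stores[i:j])
        i = j
    return out


# Four unrolled 4-byte zero stores followed by an unrelated store.
unrolled = [Store(0x10, 4, 0), Store(0x14, 4, 0), Store(0x18, 4, 0),
            Store(0x1C, 4, 0), Store(0x40, 4, 7)]
result = outline_memset(unrolled)
print(result)
```

A real implementation also has to respect things like type alignment and variable boundaries, which is why well-defined types tend to give better results.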
Next on the theme of decompiler improvements, we’ve enhanced our switch statement recovery to make it much more robust in the face of non-contiguous values! In practice, that means you’ll see nested groups of if-statements recovered as switch statements:
How often does this happen? We weren’t sure ourselves, so we took a few hundred random binaries from the Decompiler Explorer (Dogbolt) and compared the switch statement recovery before and after:
| Metric | Value |
|---|---|
| Total Switch Statements 3.3 | 861 |
| Total Switch Statements 3.4 | 1023 |
| Improvement | 18.82% |
| Binaries Scanned | 184 |
| Binaries Improved | 53 |
Turns out, what we thought was a small ability to better handle sparse switch statements resulted in almost a 20% improvement in recovering switch statements!
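For anyone who wants to check the arithmetic, the improvement figure is simply the relative increase in recovered switch statements. The short snippet below recomputes it from the totals reported above; the variable names are ours, and the share of improved binaries is an extra figure derived from the same numbers.

```python
switches_3_3 = 861     # total switch statements recovered by 3.3
switches_3_4 = 1023    # total switch statements recovered by 3.4
binaries_scanned = 184
binaries_improved = 53

# Relative increase in recovered switch statements.
improvement = (switches_3_4 - switches_3_3) / switches_3_3 * 100
# Share of scanned binaries that gained at least one switch statement.
improved_share = binaries_improved / binaries_scanned * 100

print(f"Improvement: {improvement:.2f}%")
print(f"Binaries improved: {improved_share:.1f}%")
```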
Have you ever wanted to copy data from a previous BNDB into a new one? That process just got a whole lot easier with the new Import from BNDB dialog.
The Import from BNDB feature imports types from a previous BNDB into your currently open file. In addition, it applies types for matching symbols in functions and variables. Import BNDB will not port symbols from a BNDB with symbols to one without – the names must already match. To first match similar functions without symbols and port symbols over, consider using BinDiff, whose binexport plugin works with Binary Ninja.
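The name-matching rule can be pictured with a tiny plain-Python sketch. This is not the Binary Ninja API; `import_types_by_name` and the sample names, addresses, and type strings are hypothetical and only show why symbols must already match for a type to be carried over.

```python
def import_types_by_name(source_types, target_symbols):
    """Pick the types that can be applied to the currently open file.

    source_types:   {symbol name: type string} from the previous database
    target_symbols: {address: symbol name} in the currently open file
    Returns {address: type string} for every address whose name matches.
    """
    return {
        addr: source_types[name]
        for addr, name in target_symbols.items()
        if name in source_types
    }


# Hypothetical data: the old database knows two typed symbols.
old_types = {
    "parse_header": "int32_t parse_header(struct Header* hdr)",
    "g_config": "struct Config",
}
# The new file has one function that is still unnamed (sub_401200).
new_symbols = {0x401000: "parse_header", 0x401200: "sub_401200", 0x404000: "g_config"}

applied = import_types_by_name(old_types, new_symbols)
print(applied)
```

The unnamed `sub_401200` receives nothing, which is the situation where a diffing tool is needed first to carry the symbol over.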
For this release, we’ve focused on providing additional support for deploying the Enterprise server in different environments. We’ve also done a round of updates on the user experience of managing an Enterprise server. And, finally, we’ve addressed some annoying performance issues and some reported bugs.
Up until now, deploying an Enterprise server has been easy, simple, and fast…but also somewhat inflexible. We are still working to address this inflexibility in future releases, but beta support for cloud deployments on Azure and for deployments with Podman (which improves our Red Hat compatibility) has landed in this release. We’ve also changed the way we handle SSL, and now operate more cleanly behind edge routers, web application firewalls, and reverse proxies.
One of the major changes we’ve made server-side is moving Enterprise user data into an object store. For customers that are deploying the Enterprise server on local hardware, we provide one and migrate your data automatically. But, for customers that have a cloud deployment and want to store their data with their cloud provider, we can now support external S3-compatible object stores.
In the client, we’ve improved network and UI performance for files with a large number of snapshots and projects with a large number of files. We’ve also addressed a number of edge cases that could lead to errors or crashes.
Thanks to Xusheng’s hard work, the debugger has really flourished since the last stable release, and we’ve heard from many users that they’ve been leveraging it with good results. The full list of debugger improvements is below, but the headliner is the ability to debug dynamic libraries. You’ll want to use the debug adapter settings menu item to specify the binary launcher (for example, `rundll32`), distinct from the “input file”, which would be the DLL or shared library itself:
Experimental features are those that are disabled by default but still available for users to try out with some limitations. Two features that didn’t quite make the cut-off for reliability and performance are still available for early testing in this release.
The new Components UI gives you a new way to organize your analysis. You can use it to sort your symbols into folders based on names, compiler units, or anything your heart desires! If you like scripting, there is a convenient API to enable you to write your own automatic organization. While performance for small and medium-sized binaries is good, for those working with large binaries that have large numbers of functions or global variables (on the order of 100k or more), the performance of the Components UI isn’t quite up to snuff yet. That said, we encourage everyone to give it a try! You can enable it with the `ui.components.enabled` setting.
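As an idea of what automatic organization could look like, here is a plain-Python sketch that groups symbol names into folders by their prefix. It deliberately stops short of calling the Components API, which is not shown here; `group_by_prefix`, the `misc` folder, and the sample names are hypothetical.

```python
from collections import defaultdict


def group_by_prefix(names, separator="_"):
    """Group symbol names into folders keyed by the text before the first
    separator. Names with no usable prefix land in a catch-all folder."""
    folders = defaultdict(list)
    for name in names:
        prefix, sep, _rest = name.partition(separator)
        folders[prefix if sep and prefix else "misc"].append(name)
    return dict(folders)


symbols = ["net_connect", "net_send", "crypto_hash", "main", "_start"]
folders = group_by_prefix(symbols)
for folder, members in folders.items():
    print(folder, members)
```

The resulting mapping could then be fed to whatever scripting hooks you use to create the folders themselves.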
`MH_FILESET` is a new Mach-O feature in recent macOS and iOS binaries where kernel extensions and libraries are collected into a single large file. The new support for these files can extract these modules without needing an external tool. This experimental feature is enabled by opening a Mach-O file using “Open with Options” and enabling the `loader.macho.processFileset` setting:
Note that this feature is disabled by default due to some crashes that have not yet been resolved. If robust `MH_FILESET` support is important for your workflow, make sure to switch to the dev branch, as we expect improvements to the feature on that branch after the 3.4 release.
Special thanks to op2786, WhatTheFuzz, Joe Rozner, and mkrasnitski for your contributions to our open source components during this release cycle!
- `current_*_ssa` to the python console
- `Restart` from Restart and Reopen
- `help()` in the python console now works on Windows
- `.init_array`
- `__mod_init_func` section
- `char` signedness
- `IsConstantData` API now available on C++ `RegisterValue` API
- `__bool__` for BinaryView, Segment, and Section classes
- `MetadataChoiceDialog`
- `__version__` property of the python API
- `visit` methods on IL instructions.
- `TypeLibrary::Finalize()` now returns `true` on success
- `splitPane` return the created pane
- `GetSymbolsByRawName` in the C++ API
- `interaction::get_form_input`
- `BinaryReader.seek` has new optional `whence` parameter
- `instruction_iterator.py` example to work headlessly
- `install_api.py` script has been refactored with new options
- `LogWarn` instead of `LogError`
- `GetMinimumCoreABIVersion` while loading to avoid… shenanigans
- `set_comment_at` in the scripting console
- `get_instruction_info` when returning the wrong type
- `get_string_at` API
- `bv.get_symbol_at` when Namespace was not None and symbol was an auto symbol
- `InstructionTextToken` default constructor not initializing members
- `ScriptingProviders.update_locals` raising exceptions causing console entry to fail
- `Confidence::operator==` was broken
- `__len__` of Segment objects in the python API
- `Function.function_type` is deprecated in favor of `Function.type`
- `Function.explicitly_defined_type` is deprecated in favor of `Function.has_explicitly_defined_type`
- `UMULH`/`SMULH` instructions
- `DebuggerController.register_event_callback` does not work
- `get_bundle` command to the management interface as another way to obtain a copy of the offline install bundle
- `--latest-only` flag to the `client_updates download` command in the management interface, which will download Enterprise client updates for only the latest `stable` and `dev` channel builds
- `install` command now states which version has been installed or updated to
- `--legacy-format` if you need the old one for clients older than version 3.0)
- `Project.upload_new_file` API
- `backup`, `restore`, and `change_password` management interface commands

…and, of course, all of the usual “many miscellaneous crashes and fixes” not explicitly listed here. Check them out in our full milestone list: https://github.com/Vector35/binaryninja-api/milestone/18?closed=1
management interface commands…and, of course, all of the usual “many miscellaneous crashes and fixes” not explicitly listed here. Check them out in our full milestone list: https://github.com/Vector35/binaryninja-api/milestone/18?closed=1