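Optimize the AI analysis module

Optimize the entry-point parameters and their descriptions
Optimize the parameters related to the configuration file
Optimize parts of the code execution logic
Update the README.md content
Fix the automatic download progress issue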
4 years ago · ef6fcfd4b6
parent fc51a1f360
commit ef6fcfd4b6
13 changed files with 420 additions and 168 deletions
--- a/README.md
+++ b/README.md
@@ -1,26 +1,50 @@
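 This project is currently only the tip of the iceberg of what is planned. If you are interested in it, or would like to take part in follow-up development or translation work, please send an email to blsm@vip.qq.com describing your skills and what you are looking for.
 ### AppInfoScanner
-A tool for information retrieval on Android, iOS, WEB, H5, and static websites that helps penetration testers quickly obtain useful asset information from an App or WEB site.
+A mobile-oriented (Android, iOS, WEB, H5, static website) information-gathering scanner for HW operations, red teams, and penetration-testing teams. It helps penetration-testing engineers, attack-team members, and red-team members quickly collect key asset information from mobile apps or static WEB sites and provides basic information output such as Title, Domain, CDN, fingerprint information, and status information.
-### Applicable scenarios
+### Preface
- Collect URLs, IP addresses, and key information from APPs during routine penetration testing
+- The developer of this project is currently an individual with a full-time job; new features and requests are developed in spare time, and bugs are handled first.
- Collect URLs, IP addresses, and key information from APPs in large-scale attack-and-defense exercises
+- If you run into problems or have new requirements, please submit bug feedback at [](https://github.com/kelvinBen/AppInfoScanner/issues); before submitting a bug, please read the "FAQ" at the end.
- Collect URLs, IP addresses, and key information from WEB site source code (either open-source code or a page saved via right-click "Save page source as")
+- If you find this project useful, please click the "star" button in the upper-right corner of this project.
 - If you want to keep up with new releases, please click the "Watch" button in the upper-right corner of this project.
 - If you want to take part in developing this project, please click the "Fork" button in the upper-right corner of this project; otherwise, please do not click "Fork".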
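 ### Disclaimer
 Do not use the techniques or code of this project for **illegal purposes** such as creating malware, stealing software copyrights or intellectual property, or improper profit. Doing so, or using this project to sniff data from programs whose copyright you do not own, may violate Articles 217 and 286 of the Criminal Law of the People's Republic of China, the Cybersecurity Law of the People's Republic of China, the Regulations on the Protection of Computer Software of the People's Republic of China, and other laws. The techniques mentioned in this project may be used only in lawful scenarios such as private study and testing; the author of this project bears no criminal or civil liability arising from any improper use.
 ### Applicable scenarios
 - Collect key asset information from APPs during routine penetration testing, such as URLs, IP addresses, and keywords.
 - Collect key asset information from APPs in large-scale attack-and-defense exercises, such as URLs, IP addresses, and keywords.
 - Collect URLs, IP addresses, keywords, and similar information from WEB site source code (either open-source code or a page saved via right-click "Save page source as").
 - Collect URLs, IP addresses, keywords, and similar information from H5 pages.
 - Perform targeted information collection on a specific APP.
 ### Features: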
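- Supports batch scanning of directories
+- [x] Supports directory-level batch scanning
- Supports static resource collection from DEX, APK, IPA, HTML, JS, Smali, and similar files
+- [x] Supports information collection from DEX, APK, IPA, MACH-O, HTML, JS, Smali, ELF, and similar files
- Supports custom scan rules
+- [x] Supports automatic download of APK, IPA, H5, and similar files followed by one-step information collection
- Supports IP address collection
+- [x] Supports custom request headers, request bodies, and request methods
- Supports URL collection
+- [x] Supports fully customizable scan rules
- Supports middleware information collection
+- [x] Supports custom ignoring of resource files
- Supports multithreading
+- [x] Supports custom Android shell (packer) rules
- Supports ignoring resource files during collection
+- [x] Supports custom middleware rules
- Supports Android package name collection
+- [x] Supports detection of Android hardening shells and the official iPA shell
 - [x] Supports collection of IP addresses, URLs, and middleware (json and xml components)
 - [x] Supports collection of content under a given Android package name
 - [x] Supports network sniffing, which provides basic information output
 - [x] Supports Windows, MacOS, and *nux systems
 - [x] Includes a simple AI recognition feature that quickly filters out third-party URLs
 - [ ] Add internationalization language packs
 - [ ] One-click automatic repair of APK files
 - [ ] Automatically unpack once a shell is detected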
 ### Screenshots
 ![](result.png)
 ### Environment
 - Parsing Apk files requires a JAVA environment, JAVA version 1.8 or below
 - A Python3 runtime environment
@@ -29,133 +53,320 @@
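 AppInfoScanner
    |-- libs  core program code
        |-- core
            |-- __init__.py global configuration
            |-- parses.py parses static information in files
            |-- download.py automatically downloads APPs or H5 pages
            |-- net.py performs network sniffing and gathers basic information
        |-- task
-            |-- base_task.py unified task scheduling
+            |-- __init__.py directory initialization file
-            |-- android_task.py handles Android-related files
+            |-- base_task.py unified task scheduling center
-            |-- ios_task.py handles iOS-related files
+            |-- android_task.py handles Android-related tasks
-            |-- web_task.py handles Web-related files, such as page source saved via right-click and H5 static information
+            |-- download_task.py handles tasks that automatically download APPs or H5 pages
            |-- ios_task.py handles iOS-related tasks
            |-- net_task.py handles network-sniffing tasks
            |-- web_task.py handles Web-related tasks, such as page source saved via right-click and H5 static information
    |-- tools tools the program depends on
-        |-- apktool.jar decompiles apk files
+        |-- apktool.jar decompiles apk files; you may need to swap in a different build on different platforms
-        |-- baksmali.jar decompiles dex files
+        |-- baksmali.jar decompiles dex files; you may need to swap in a different build on different platforms
        |-- strings.exe extracts string information from an iPA on 32-bit windows
        |-- strings64.exe extracts string information from an iPA on 64-bit windows
-    |-- app.py main program
+    |-- __init__.py directory initialization file
-    |-- config.py used to customize rules
+    |-- app.py main program
-    |-- readme.md  usage instructions
+    |-- config.py configuration file for the whole program
    |-- README.md  usage instructions
    |-- requirements.txt dependencies that need to be installed
    |-- update.md version history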
 ```
 ### Usage
 1. Download
 ```
    git clone https://github.com/kelvinBen/AppInfoScanner.git
    Or copy the following link into a browser to download the latest release
    https://github.com/kelvinBen/AppInfoScanner/releases/latest
 ```
 2. Install dependencies
 ```
    cd AppInfoScanner
    python3 -m pip install -r requirements.txt
 ```
-### Android operations
+3. Run (basic)
-#### Scan a specified apk
+- Scan an Android APK file, a DEX file, an APK download URL, or a directory containing the files to scan
 ```
-python3 app.py android -i <Your apk file>  
+    python3 app.py android -i <Your APK File or DEX File or APK Download Url or Save File Dir>
 ```
-#### Scan a specified dex
+- Scan an iOS IPA file, a Mach-o file, an IPA download URL, or a directory containing the files to scan
 ```
-python3 app.py android -i <Your dex file> 
+    python3 app.py ios -i <Your IPA file or Mach-o File or IPA Download Url or Save File Dir>
 ```
-#### Scan all APK or dex files in a directory
+- Scan a Web site file, a directory, or a site URL to cache
 ```
-python3 app.py android -i <Your apk or dex directory> 
+    python3 app.py web -i <Your Web file or Save Web Dir or Web Cache Url>
 ```
-#### Scan for a specified keyword (temporary)
+### Advanced guide
 #### Basic command format
 ```
-python3 app.py android -i <Your apk or dex directory> -r <the keyword>
+python3 app.py [TYPE] [OPTIONS] <The URL or directory to scan>
 ```
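-#### Ignore resource files while scanning
+#### Notation
 ```
-python3 app.py android -i <Your apk or dex directory> -n
+<> denotes the file, directory, or URL to scan
 | means "or"; only one may be chosen
 [] denotes a parameter to be supplied
 ```
-#### Scan content under specified packages
+#### TYPE parameter details
 This parameter corresponds to [TYPE] in the basic command format. Only three types are currently supported, [android/ios/web], and exactly one of them must be specified.
 ```
 android: scans the content of Android application files
 ios: scans the content of iOS application files
 web: scans the content of WEB site or H5 files
 ```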
-python3 app.py android -i <Your apk or dex directory> -p <package1.package2>
+
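 The type is corrected automatically from the file suffix: even if ios is given, when the file passed to -i is named XXX.apk, the android scan is run.
 #### OPTIONS parameter details
 This parameter corresponds to [OPTIONS] in the basic command format; multiple options may be combined.
 ```
 -i or --inputs: the file or directory to scan, or the URL of a file to download automatically; if the path is long, wrap it in double quotes ("). This option is required.
 -r or --rules: a temporary scan rule for file contents.
 -s or --sniffer: enables network sniffing; on by default.
 -n or --no-resource: ignores all resource files, including resource files in network sniffing (configure the sniffer_filter rules in config.py first); by default resources are not ignored.
 -a or --all: outputs the full set of results that match the scan rules; on by default.
 -t or --threads: sets the number of concurrent threads; defaults to 10.
 -o or --output: the output directory for scan results and temporary files produced during scanning; defaults to the directory containing the script.
 -p or --package: the JAVA package name to scan within an Android APK or DEX file. This option can only be used with the android type.
 ```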
 #### Usage in detail
 ##### Basic Android operations
 - Scan a local APK file
 ```
 python3 app.py android -i <Your apk file>  
-#### Scan all strings
+Example:
 python3 app.py android -i  C:\Users\Administrator\Desktop\Demo.apk
 ```
-python3 app.py android -i <Your apk or dex directory>  -a
+
 - Scan a local Dex file
 ```
 python3 app.py android -i <Your DEX file>  
-#### Set the number of threads
+Example:
 python3 app.py android -i  C:\Users\Administrator\Desktop\Demo.dex
 ```
-python3 app.py android -i <Your apk or dex directory> -t 10
+- Scan an APK file referenced by a URL
 ```
 python3 app.py android -i <APK Download Url>  
 Example:
-### iOS operations
+python3 app.py android -i "https://127.0.0.1/Demo.apk" 
-#### Scan a specified iPA file
+```
 Note: if the URL is long, wrap it in double quotes (").
 ##### Basic iOS operations
 - Scan a local IPA file
 ```
 python3 app.py ios -i <Your ipa file>
 Example:
 python3 app.py ios -i "C:\Users\Administrator\Desktop\Demo.ipa" 
 ```
 - Scan a local Mach-O file
 ```
 python3 app.py ios -i <Your Mach-o file>
 Example:
 python3 app.py ios -i "C:\Users\Administrator\Desktop\Demo\Payload\Demo.app\Demo" 
 ```
 - Scan an IPA file referenced by a URL
 ```
 python3 app.py ios -i <IPA Download Url>  
 Example:
 python3 app.py ios -i "https://127.0.0.1/Demo.ipa" 
 ```
 Note: if the URL is long, wrap it in double quotes ("). Scanning IPA files from the Apple Store is not supported yet.
-#### Scan for a specified keyword (temporary)
+##### Basic Web operations
 - Scan a local WEB site
 ```
-python3 app.py ios -i <Your ipa file> -r  <the keyword>
+python3 app.py web -i <Your web file>
 Example:
 python3 app.py web -i "C:\Users\Administrator\Desktop\Demo.html" 
 ```
 - Scan WEB site files referenced by a URL
 ```
 python3 app.py web -i <Web Download Url>  
-#### Ignore resource files while scanning
+Example:
 python3 app.py web -i "https://127.0.0.1/Demo.html" 
 ```
-python3 app.py ios -i <Your ipa file> -n
+
 ##### Operations common to all types
 The following operations all use the android type as the example:
 - Scan a local directory
 ```
 python3 app.py android -i <Your Dir>
-#### Output all strings
+Example:
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo
 ```
-python3 app.py ios -i <Your ipa file> -a
+
 - Add a temporary rule or keyword
 ```
 python3 app.py android -i <Your apk> -r <the keyword | the rules>
-#### Set the number of threads
+Example:
 Add a scan for Baidu domains
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -r ".*baidu.com.*"
 ```
-python3 app.py ios -i <Your ipa file> -t 10
+
 - Disable network sniffing
 ```
 python3 app.py android -i <Your apk> -s
-### Web operations
+Example:
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -s
 #### Scan a specified Web site directory or html files
 ```
-python3 app.py web -i <Your website directory> 
+- Ignore all resource files
 ```
 python3 app.py android -i <Your apk> -n
-#### Scan for a specified keyword (temporary)
+Example:
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -n
 ```
-python3 app.py web -i  <Your website directory> -r <the keyword>
+
 - Disable output of all content that matches the scan rules
 ```
 python3 app.py android -i <Your apk> -a
 Example:
-#### Output all strings
+python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -a
 ```
-python3 app.py web -i  <Your website directory> -a
+
 - Set the concurrency
 ```
 python3 app.py android -i <Your apk> -t 20
 Example:
 Set 20 concurrent threads
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -t 20 
 ```
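 - Specify the output directory for results and cache files
 ```
 python3 app.py android -i <Your apk> -o <output path>
-#### Set the number of threads
+Example:
 For example, output to the Temp directory on the desktop
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -o C:\Users\Administrator\Desktop\Temp
 ```
-python3 app.py web -i <Your website directory> -t 10
+
 - Scan file content under a specified package name; this is only supported for the android type
 ```
 python3 app.py android -i <Your apk> -p <Java package name>
 Example:
 For example, to filter content under the com.baidu package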
 python3 app.py android -i C:\Users\Administrator\Desktop\Demo.apk -p "com.baidu"
 ```
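 ### Advanced usage
 The program in this project is only a basic framework with some built-in rules; not every input can be scanned successfully. Configure the rules to suit your own needs; a well-tuned configuration can make a qualitative difference.
 - The configuration file is config.py in the root directory, alongside README.md
 #### Configuration options
 ```
 filter_components: configures component content, including Json or XML components
 filter_strs: configures the file content to scan for; for example, to scan for port numbers, set it to: "r'.*://([\d{1,3}\.]{3}\d{1,3}).*'"
 filter_no: ignores unwanted content in scanned files
 shell_list: configures Android shell signatures
 web_file_suffix: configures the WEB file suffixes to scan
 sniffer_filter: configures the file suffixes that network sniffing should ignore
 headers: configures the request headers used during automatic download
 data: configures the request body used during automatic download
 method: configures the request method used during automatic download
 ```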
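 ### FAQ
- 1. Too much junk data in the retrieval results?
+####  1. Too much junk data in the retrieval results?
 ```
 Option 1: Adjust the rules in config.py to fit the actual situation
 Option 2: Ignore resource files
 ```
 #### 2. Error message: Error: This application has shell, the retrieval results may not be accurate, Please remove the shell and try again!
-> Option 1: Adjust the rules in config.py to fit the actual situation
+This means the application to scan is protected by a shell and must be unpacked before it can be scanned. The following tools can currently be used for unpacking
-> Option 2: Ignore resource files
+```
    Android:
        xposed module: dexdump
        frida module: FRIDA-DEXDump
    iOS:
        frida module:
            on windows: frida-ipa-dump
            on MacOS: frida-ios-dump
 ```
 #### 3. Error message: File download failed! Please download the file manually and try again.
 The file download failed.
 ```
 1) Check that the URL you entered is correct
 2) Check the network, or configure the request headers (headers), request body (data), and request method (method) in config.py, save, and run again.
 ```
 #### 4. Error message: Decompilation failed, please submit error information at https://github.com/kelvinBen/AppInfoScanner/issues
 File decompilation failed.
 ```
 Please submit a screenshot of the error and the corresponding APK file at https://github.com/kelvinBen/AppInfoScanner/issues; the author will handle it promptly once it is seen.
 ```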
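Note on the `-r` option documented in the README above: in `libs/task/base_task.py` from this same commit, the value is wrapped as `r'.*' + str(rules) + '.*'` and appended to `config.filter_strs`. A minimal sketch of that wrapping follows. The helper names `wrap_rule` and `matches` are hypothetical, and matching with `re.search` is an assumption, since the matching code in `parses.py` is not shown in full in this diff.

```python
import re


def wrap_rule(rule):
    """Wrap a temporary rule the way BaseTask does before appending it to filter_strs."""
    return r'.*' + str(rule) + r'.*'


def matches(rule, candidates):
    """Return the candidates hit by the wrapped rule (assumption: re.search semantics)."""
    pattern = re.compile(wrap_rule(rule))
    return [text for text in candidates if pattern.search(text)]


if __name__ == "__main__":
    samples = [
        "https://www.baidu.com/index.html",
        "https://example.org/app.js",
    ]
    # A bare keyword is enough, because the wrapper adds the surrounding ".*" itself.
    print(matches("baidu.com", samples))
```

Because the wrapper already adds `.*` on both sides, passing `".*baidu.com.*"` as in the README example would presumably behave the same as passing `baidu.com`.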
--- a/app.py
+++ b/app.py
@@ -3,7 +3,6 @@
 # Author: kelvinBen
 # Github: https://github.com/kelvinBen/AppInfoScanner
 import click
 from libs.core import Bootstrapper
@@ -15,61 +14,57 @@ def cli():
 # Create the Android task
@cli.command(help="Get the key information of Android system.")
-@click.option("-i", "--inputs", required=True, type=str, help="Input APK file or DEX directory.")
+@click.option("-i", "--inputs", required=True, type=str, help="Please enter the APK file or DEX file to be scanned or the corresponding APK download address.")
-@click.option("-r", "--rules", required=False, type=str, default="", help="Add regular search rule.")
+@click.option("-r", "--rules", required=False, type=str, default="", help="Please enter a rule for temporary scanning of file contents.")
-@click.option("-s", "--net-sniffer", is_flag=True, default=True, help="Whether to enable network sniffing.")
+@click.option("-s", "--sniffer", is_flag=True, default=False, help="Enable the network sniffer function. It is on by default.")
-@click.option("-n", '--no-resource', is_flag=True, default=False,help="Ignore resource files.")
+@click.option("-n", '--no-resource', is_flag=True, default=False,help="Ignore all resource files, including network sniffing. It is not enabled by default.")
-@click.option("-p", '--package',required=False,type=str,default="",help="Specifies the retrieval package name.")
+@click.option("-a", '--all',is_flag=True, default=False,help="Output the string content that conforms to the scan rules.It is on by default.")
-@click.option("-a", '--all-str',is_flag=True, default=False,help="Output all strings.")
+@click.option("-t", '--threads',required=False, type=int,default=10,help="Set the number of concurrency. The larger the concurrency, the faster the speed. The default value is 10.")
-@click.option("-t", '--threads',required=False, type=int,default=10,help="Set the number of threads to 10 by default")
+@click.option("-o", '--output',required=False, type=str,default=None,help="Specify the result set output directory.")
-def android(inputs: str, rules: str, net_sniffer: bool,no_resource:bool,package:str,all_str:bool,threads:int) -> None:
+@click.option("-p", '--package',required=False,type=str,default="",help="Specifies the package name information that needs to be scanned.")
+def android(inputs: str, rules: str, sniffer: bool, no_resource:bool, all:bool, threads:int, output, package:str) -> None:
    try:
-        # Initialize the global object
-        bootstrapper = Bootstrapper(__file__)
+        bootstrapper = Bootstrapper(__file__, output, all, no_resource)
        bootstrapper.init()
-        BaseTask("Android", inputs, rules, net_sniffer, no_resource, package, all_str, threads).start()
+        BaseTask("Android", inputs, rules, sniffer, threads, package).start()
    except Exception as e:
        raise e
@cli.command(help="Get the key information of iOS system.")
-@click.option("-i", "--inputs", required=True, type=str, help="Input IPA file or ELF file.")
+@click.option("-i", "--inputs", required=True, type=str, help="Please enter IPA file or ELF file to scan or corresponding IPA download address. App store is not supported at present.")
-@click.option("-r", "--rules", required=False, type=str, default="", help="Add regular search rule.")
+@click.option("-r", "--rules", required=False, type=str, default="", help="Please enter a rule for temporary scanning of file contents.")
-@click.option("-s", "--net-sniffer", is_flag=True, default=True, help="Whether to enable network sniffing.")
+@click.option("-s", "--sniffer", is_flag=True, default=False, help="Enable the network sniffer function. It is on by default.")
-@click.option("-n", '--no-resource', is_flag=True, default=False,help="Ignore resource files.")
+@click.option("-n", '--no-resource', is_flag=True, default=False,help="Ignore all resource files, including network sniffing. It is not enabled by default.")
-@click.option("-a", '--all-str',is_flag=True, default=False,help="Output all strings.")
+@click.option("-a", '--all',is_flag=True, default=False,help="Output the string content that conforms to the scan rules.It is on by default.")
-@click.option("-t", '--threads',required=False, type=int,default=10,help="Set the number of threads to 10 by default")
+@click.option("-t", '--threads',required=False, type=int,default=10,help="Set the number of concurrency. The larger the concurrency, the faster the speed. The default value is 10.")
-def ios(inputs: str, rules: str, net_sniffer: bool,no_resource:bool,all_str:bool,threads:int) -> None:
+@click.option("-o", '--output',required=False, type=str,default=None,help="Specify the result set output directory.")
+def ios(inputs: str, rules: str, sniffer: bool, no_resource:bool, all:bool, threads:int, output:str) -> None:
    try:
-        # Initialize the global object
-        bootstrapper = Bootstrapper(__file__)
+        bootstrapper = Bootstrapper(__file__, output, all, no_resource)
        bootstrapper.init()
-        BaseTask("iOS", inputs, rules, net_sniffer, no_resource, all_str, threads).start()
+        BaseTask("iOS", inputs, rules, sniffer, threads).start()
    except Exception as e:
        raise e
@cli.command(help="Get the key information of Web system.")
-@click.option("-i", "--inputs", required=True, type=str, help="Input WebSite dir.")
+@click.option("-i", "--inputs", required=True, type=str, help="Please enter the site directory or site file to scan or the corresponding site download address.")
-@click.option("-r", "--rules", required=False, type=str, default="", help="Add regular search rule.")
+@click.option("-r", "--rules", required=False, type=str, default="", help="Please enter a rule for temporary scanning of file contents.")
-@click.option("-a", '--all-str',is_flag=True, default=False,help="Output all strings.")
+@click.option("-s", "--sniffer", is_flag=True, default=False, help="Enable the network sniffer function. It is on by default.")
-@click.option("-t", '--threads',required=False, type=int,default=10,help="Set the number of threads to 10 by default")
+@click.option("-n", '--no-resource', is_flag=True, default=False,help="Ignore all resource files, including network sniffing. It is not enabled by default.")
-@click.option("-s", "--net-sniffer", is_flag=True, default=True, help="Whether to enable network sniffing.")
+@click.option("-a", '--all',is_flag=True, default=False,help="Output the string content that conforms to the scan rules.It is on by default.")
-def web(inputs: str, rules: str, all_str:bool,threads:int,net_sniffer) -> None:
+@click.option("-t", '--threads',required=False, type=int,default=10,help="Set the number of concurrency. The larger the concurrency, the faster the speed. The default value is 10.")
+@click.option("-o", '--output',required=False, type=str,default=None,help="Specify the result set output directory.")
+def web(inputs: str, rules: str, sniffer: bool, no_resource:bool, all:bool, threads:int, output:str) -> None:
    try:
-        # Initialize the global object
-        bootstrapper = Bootstrapper(__file__)
+        bootstrapper = Bootstrapper(__file__, output, all, no_resource)
        bootstrapper.init()
-        BaseTask("Web", inputs, rules,all_str, net_sniffer,threads).start()
+        BaseTask("Web", inputs, rules, sniffer, threads).start()
    except Exception as e:
        raise e
 def main():
    cli()
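Note on flag semantics: in this commit `-s/--sniffer` and `-a/--all` are declared with `is_flag=True, default=False`, while `BaseTask` stores `self.sniffer = not sniffer` and `Bootstrapper` stores `all_flag = not all`. Passing either flag therefore appears to switch the feature off, which agrees with the "disable" examples in the README. The sketch below reproduces that inversion with the standard-library `argparse` instead of `click`, purely so that it is self-contained; `parse_flags` is a hypothetical helper and not part of the project.

```python
import argparse


def parse_flags(argv):
    """Parse -s/-a as boolean flags and derive the effective feature state."""
    parser = argparse.ArgumentParser(prog="app.py")
    # Both flags default to False, mirroring is_flag=True, default=False in the diff.
    parser.add_argument("-s", "--sniffer", action="store_true", default=False)
    parser.add_argument("-a", "--all", action="store_true", default=False)
    args = parser.parse_args(argv)
    # The commit negates the raw flag values before using them.
    return {
        "sniffer": not args.sniffer,
        "all_flag": not args.all,
    }


if __name__ == "__main__":
    print(parse_flags([]))      # no flags: both features stay on
    print(parse_flags(["-s"]))  # -s given: sniffing is switched off
```

If this reading is right, the help strings ("Enable the network sniffer function. It is on by default.") describe the default state rather than the effect of the flag.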
--- a/config.py
+++ b/config.py
@@ -83,6 +83,14 @@ web_file_suffix =[
    "py"
 ]
 # File suffixes that network sniffing should ignore; configure as needed. Nothing is filtered by default
 sniffer_filter=[
    # "jpg",
    # "png",
    # "jpeg",
    # "gif",
 ]
 # Request headers used when automatically downloading Apk files or caching HTML
 headers = {
    "User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0",
--- a/libs/core/__init__.py
+++ b/libs/core/__init__.py
@@ -6,7 +6,6 @@ import time
 import shutil
 import platform
 # Path to smali
 smali_path = ""
@@ -25,9 +24,12 @@ output_path = ""
 # Download-complete flag
 download_flag = False
 # Starting row number in excel
 excel_row = 0
 class Bootstrapper(object):
-    def __init__(self, path):
+    def __init__(self, path, out_path, all=False, no_resource= False):
        global smali_path
        global backsmali_path
        global apktool_path
@@ -43,11 +45,22 @@ class Bootstrapper(object):
        global excel_row 
        global download_path
        global download_flag
        global out_dir
        global all_flag
        global resource_flag 
-        create_time = time.strftime("%Y%m%d%H%M%S", time.localtime())
+        all_flag = not all
        resource_flag = no_resource
        create_time = time.strftime("%Y%m%d%H%M%S", time.localtime())
        script_root_dir =  os.path.dirname(os.path.abspath(path))
        if out_path:
            out_dir = out_path
        else:
            out_dir = script_root_dir
        tools_dir = os.path.join(script_root_dir,"tools")
        output_path = os.path.join(out_dir,"out")
        history_path = os.path.join(script_root_dir,"history")
        if platform.system() == "Windows":
            machine2bits = {'AMD64':64, 'x86_64': 64, 'i386': 32, 'x86': 32}
@@ -60,29 +73,32 @@ class Bootstrapper(object):
        else:
            strings_path ="strings"
        excel_row = 0
        backsmali_path = os.path.join(tools_dir,"baksmali.jar")
        apktool_path = os.path.join(tools_dir, "apktool.jar")
-        output_path = os.path.join(script_root_dir,"out")
+        download_path = os.path.join(out_dir,"download")
-        history_path = os.path.join(script_root_dir,"history")
+        txt_result_path = os.path.join(out_dir,"result_"+str(create_time)+".txt")
-        download_path = os.path.join(script_root_dir,"download")
+        xls_result_path = os.path.join(out_dir,"result_"+str(create_time)+".xls")
        txt_result_path = os.path.join(script_root_dir,"result_"+str(create_time)+".txt")
        xls_result_path = os.path.join(script_root_dir,"result_"+str(create_time)+".xls")
        app_history_path = os.path.join(history_path,"app_history.txt")
        domain_history_path = os.path.join(history_path,"domain_history.txt")
    def init(self):
        if not os.path.exists(out_dir):
            os.makedirs(out_dir)
            print("[*] Create directory {}".format(out_dir))
        if os.path.exists(output_path):
            shutil.rmtree(output_path)
        os.makedirs(output_path)
        print("[*] Create directory {}".format(output_path))
-        if os.path.exists(download_path):
+        if not os.path.exists(download_path):
-            shutil.rmtree(download_path)
+            # shutil.rmtree(download_path)
-        os.makedirs(download_path)
+            os.makedirs(download_path)
            print("[*] Create directory {}".format(download_path))
        if not os.path.exists(history_path):
            os.makedirs(history_path)
            print("[*] Create directory {}".format(history_path))
        if os.path.exists(txt_result_path):
            os.remove(txt_result_path)
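Note on the new `-o/--output` handling: the added lines suggest that results, the `out` working directory, and the `download` directory now follow `out_dir`, while `history` and `tools` stay next to the script. A minimal sketch of that resolution is below. `resolve_paths` is a hypothetical helper that returns a dictionary instead of setting module globals, and the `create_time` parameter is added only to make the result reproducible.

```python
import os
import time


def resolve_paths(script_path, out_path=None, create_time=None):
    """Derive the working paths the way the added lines in Bootstrapper.__init__ appear to."""
    if create_time is None:
        create_time = time.strftime("%Y%m%d%H%M%S", time.localtime())
    script_root_dir = os.path.dirname(os.path.abspath(script_path))
    # -o/--output wins when given; otherwise fall back to the script directory.
    out_dir = out_path if out_path else script_root_dir
    return {
        "out_dir": out_dir,
        "output_path": os.path.join(out_dir, "out"),
        "download_path": os.path.join(out_dir, "download"),
        "txt_result_path": os.path.join(out_dir, "result_" + str(create_time) + ".txt"),
        "xls_result_path": os.path.join(out_dir, "result_" + str(create_time) + ".xls"),
        # History and tools stay beside the script regardless of -o.
        "history_path": os.path.join(script_root_dir, "history"),
        "tools_dir": os.path.join(script_root_dir, "tools"),
    }


if __name__ == "__main__":
    for name, value in resolve_paths(__file__, out_path="results").items():
        print(name, "=", value)
```

Returning a dictionary rather than assigning globals is a design choice of the sketch only; the commit itself keeps the module-level variables.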
--- a/libs/core/download.py
+++ b/libs/core/download.py
@@ -3,6 +3,7 @@
 # Github: https://github.com/kelvinBen/AppInfoScanner
 import re
 import os
 import sys
 import time
 import config
 import requests
@@ -38,7 +39,7 @@ class DownloadThreads(threading.Thread):
            if resp.status_code == requests.codes.ok:
                if self.types == "Android" or self.types == "iOS":
                    count = 0
-                    count_tmp = 0
+                    progress_tmp = 0
                    time1 = time.time()
                    length = float(resp.headers['content-length'])
                    with open(self.cache_path, "wb") as f:
@@ -46,12 +47,12 @@ class DownloadThreads(threading.Thread):
                            if chunk:
                                f.write(chunk)
                                count += len(chunk)
-                                if time.time() - time1 > 1:
+                                progress = int(count / length * 100)
-                                    p = count / length * 100
+                                if progress != progress_tmp:
-                                    speed = (count - count_tmp) / 1024 / 1024 / 2
+                                    progress_tmp = progress
-                                    # count_tmp = count
+                                    print("\r", end="")
-                                    print(self.file_name + ': ' + formatFloat(p) + '%' + ' Speed: ' + formatFloat(speed) + 'M/S')
+                                    print("[*] Download progress: {}%: ".format(progress), "▋" * (progress // 2), end="")
-                                    time1 = time.time()
+                                    sys.stdout.flush()
                        f.close()
                else:
                    html = resp.html()
@@ -59,10 +60,9 @@ class DownloadThreads(threading.Thread):
                        f.write(html)
                        f.close()
                cores.download_flag = True
-        except Exception:
+        except Exception as e:
            raise Exception(e)
            return
    def formatFloat(num):
        return '{:.2f}'.format(num)    
    def run(self):
        threadLock = threading.Lock()
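Note on the progress fix: the new loop replaces the once-per-second speed printout with a bar that is redrawn only when the integer percentage changes. The sketch below isolates that logic so it can be run without a network request. `progress_updates` is a hypothetical generator, the chunk sizes stand in for `len(chunk)`, and the line is built by string concatenation, so its spacing differs slightly from the two-argument `print` call in the diff.

```python
import sys


def progress_updates(chunk_sizes, length):
    """Yield one progress line each time the integer percentage changes."""
    count = 0
    progress_tmp = 0
    for size in chunk_sizes:
        count += size
        progress = int(count / length * 100)
        if progress != progress_tmp:
            progress_tmp = progress
            # One bar segment per 2%, as in the diff.
            yield "[*] Download progress: {}%: ".format(progress) + "▋" * (progress // 2)


if __name__ == "__main__":
    for line in progress_updates([25, 25, 25, 25], 100):
        # "\r" returns to the start of the line so the bar is redrawn in place.
        sys.stdout.write("\r" + line)
        sys.stdout.flush()
    print()
```

Like the code in the diff, the sketch assumes the total length is known and non-zero; a response without a `content-length` header would need separate handling.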
--- a/libs/core/parses.py
+++ b/libs/core/parses.py
@@ -10,13 +10,12 @@ import libs.core as cores
 class ParsesThreads(threading.Thread):
-    def __init__(self,threadID,name,file_queue,all,result_dict,types):
+    def __init__(self,threadID,name,file_queue,result_dict,types):
        threading.Thread.__init__(self) 
        self.file_queue = file_queue
        self.name = name
        self.threadID = threadID
        self.result_list = []
        self.all = all
        self.result_dict=result_dict
        self.types = types
@@ -75,7 +74,8 @@ class ParsesThreads(threading.Thread):
                    continue
                self.threadLock.acquire()
-                print("[+] The string searched for matching rule is: %s" % (resl_str))
+                if cores.all_flag:
                    print("[+] The string searched for matching rule is: %s" % (resl_str))
                self.result_list.append(resl_str)
                self.threadLock.release()
            continue
@@ -87,7 +87,7 @@ class ParsesThreads(threading.Thread):
        if len(resl_str) == 0:
            return 0
-        for filte in config.filter_no:
+        for filte in set(config.filter_no):
            resl_str = resl_str.replace(filte,"")
            if len(resl_str) == 0:
                return_flag = 0 
--- a/libs/task/android_task.py
+++ b/libs/task/android_task.py
@@ -11,11 +11,9 @@ import libs.core as cores
 class AndroidTask(object):
-    def __init__(self,path,no_resource,package):
+    def __init__(self,path,package):
        self.path = path
        self.no_resource = no_resource
        self.package = package
        self.file_queue = Queue()
        self.shell_flag=False
        self.packagename=""
@@ -93,7 +91,7 @@ class AndroidTask(object):
            if "smali" in file_name or "assets" in file_name:
                scanner_file_suffixs = ["smali","js","xml"]
-                if self.no_resource:
+                if cores.resource_flag:
                    scanner_file_suffixs =["smali"]
                self.__get_scanner_file__(file_path,scanner_file_suffixs)
--- a/libs/task/base_task.py
+++ b/libs/task/base_task.py
@@ -18,18 +18,16 @@ class BaseTask(object):
    thread_list =[]
    result_dict = {}
    app_history_list=[]
-    
+    domain_history_list=[]
    # Unified initialization entry point
-    def __init__(self, types="Android", inputs="", rules="", net_sniffer=True, no_resource=False, package="", all_str=False, threads=10):
+    def __init__(self, types="Android", inputs="", rules="", sniffer=True, threads=10, package=""):
        self.types = types
        self.net_sniffer = net_sniffer
        self.path = inputs
        if rules:
            config.filter_strs.append(r'.*'+str(rules)+'.*')
-        self.no_resource = no_resource
+        self.sniffer = not sniffer
        self.package = package
        self.all = all_str
        self.threads = threads
        self.package = package
        self.file_queue = Queue()
@@ -41,7 +39,7 @@ class BaseTask(object):
        # Load the history records
        self.__history_handle__()
-        print("[*] The filtering rules obtained by AI are as follows: %s" % (config.filter_no) )
+        print("[*] The filtering rules obtained by AI are as follows: %s" % (set(config.filter_no)) )
        # Task control center
        task_info = self.__tast_control__()
@@ -55,7 +53,7 @@ class BaseTask(object):
        file_identifier = task_info["file_identifier"]
        if shell_flag:
-            print('\033[3;31m Error: This application has shell, the retrieval results may not be accurate, Please remove the shell and try again!')
+            print('[-] \033[3;31m Error: This application has shell, the retrieval results may not be accurate, Please remove the shell and try again!')
            return
        # Thread control center
@@ -72,21 +70,21 @@ class BaseTask(object):
    def __tast_control__(self):
        task_info = {}
        # Automatically correct the type based on the file suffix
        cache_info = DownloadTask().start(self.path,self.types)
        cacar_path = cache_info["path"]
        types = cache_info["type"]
-        if not os.path.exists(cacar_path) and cores.download_flag:
+
        if not (os.path.exists(cacar_path) and cores.download_flag):
            print("[-] File download failed! Please download the file manually and try again.")
            return task_info
        # Invoke the Android handling logic
        if types == "Android":
-            task_info = AndroidTask(cacar_path,self.no_resource,self.package).start()
+            task_info = AndroidTask(cacar_path,self.package).start()
        # Invoke the iOS handling logic
        elif types == "iOS":
-            task_info = iOSTask(cacar_path,self.no_resource).start()
+            task_info = iOSTask(cacar_path).start()
        # Invoke the Web handling logic
        else:
            task_info = WebTask(cacar_path).start()
@@ -94,17 +92,19 @@ class BaseTask(object):
    def __threads_control__(self,file_queue):
        for threadID in range(1,self.threads): 
-            name = "Thread - " + str(threadID)
+            name = "Thread - " + str(int(threadID))
-            thread =  ParsesThreads(threadID,name,file_queue,self.all,self.result_dict,self.types)
+            thread =  ParsesThreads(threadID,name,file_queue,self.result_dict,self.types)
            thread.start()
            self.thread_list.append(thread)
    def __print_control__(self,packagename,comp_list,file_identifier):
        txt_result_path = cores.txt_result_path
        xls_result_path = cores.xls_result_path
-        
+                
-        # A hash or an application name is needed here: the package for apk files, the hash for dex files, the file name for mach-o files
+        if self.sniffer:
-
+            print("[*] ========= Sniffing the URL address of the search ===============")
            NetTask(self.result_dict,self.app_history_list,self.domain_history_list,file_identifier,self.threads).start()
        if packagename: 
            print("[*] =========  The package name of this APP is: ===============")
            print(packagename)
@@ -114,12 +114,12 @@ class BaseTask(object):
            for json in comp_list:
                print(json)
-        if self.net_sniffer:
+        if cores.all_flag:
-            print("[*] ========= Sniffing the URL address of the search ===============")
+            print("[*] For more information about the search, see TXT file result: %s" %(cores.txt_result_path))
-            NetTask(self.result_dict,self.app_history_list,file_identifier,self.threads).start()
+
        if self.sniffer:
            print("[*] For more information about the search, see XLS file result: %s" %(cores.xls_result_path))
-        print("[*] For more information about the search, see TXT file result: %s" %(cores.txt_result_path))
+
    def __history_handle__(self):
        domain_history_path =  cores.domain_history_path
        app_history_path = cores.app_history_path
@@ -141,8 +141,8 @@ class BaseTask(object):
                    cout = cout + 1
                for line in lines:
                    domain = line.replace("\r","").replace("\n","")
                    self.domain_history_list.append(domain)
                    domain_count = lines.count(line)
                    if domain_count >= cout:
                        config.filter_no.append(domain)
                f.close()
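Note on the "AI" filter: `__history_handle__` appends a domain to `config.filter_no` whenever its count in the history reaches the `cout` threshold, and because this happens once per line the same domain can be appended repeatedly, which is presumably why this commit wraps `config.filter_no` in `set(...)` elsewhere. The sketch below shows the same idea with `collections.Counter`. `frequent_domains` is a hypothetical helper, and the meaning of the threshold is an assumption, since the computation of `cout` is only partly visible in this hunk.

```python
from collections import Counter


def frequent_domains(history_lines, threshold):
    """Return the domains that occur at least `threshold` times in the history."""
    # Strip line endings the same way the diff does.
    domains = [line.replace("\r", "").replace("\n", "") for line in history_lines]
    counts = Counter(domains)
    # Building a set avoids duplicate entries in the resulting filter list.
    return {domain for domain, seen in counts.items() if seen >= threshold}


if __name__ == "__main__":
    history = ["cdn.example.com\n", "api.example.org\n", "cdn.example.com\n"]
    print(frequent_domains(history, 2))
```

Counting once with `Counter` also avoids calling `lines.count(line)` inside the loop, which rescans the whole list for every line.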
--- a/libs/task/download_task.py
+++ b/libs/task/download_task.py
@@ -21,8 +21,15 @@ class DownloadTask(object):
            types = "iOS"
            file_name = create_time + ".ipa"
        else:
-            types = "WEB"
+            if types == "Android":
-            file_name = create_time + ".html"
+                types = "Android"
                file_name = create_time+ ".apk"
            elif types == "iOS":
                types = "iOS"
                file_name = create_time + ".ipa"
            else:
                types = "WEB"
                file_name = create_time + ".html"
        if not(path.startswith("http://") or path.startswith("https://")):
            if not os.path.isdir(path):
@@ -35,5 +42,6 @@ class DownloadTask(object):
            thread = DownloadThreads(path,file_name,cache_path,types)
            thread.start()
            thread.join()
            print()
            return {"path":cache_path,"type":types}
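Note on type correction: the new `else` branch means an input without a recognizable suffix now keeps the type given on the command line instead of always falling back to `WEB`. A minimal sketch of the resulting decision is below. `correct_type` is a hypothetical function, and the two suffix checks at the top are assumptions, because the branches that precede this hunk are not shown in the diff.

```python
def correct_type(path, declared_type, create_time):
    """Pick the task type and cache file name from the suffix, else from the declared type."""
    # Assumed suffix checks; only their results are visible in the hunk.
    if path.endswith("apk"):
        return "Android", create_time + ".apk"
    if path.endswith("ipa"):
        return "iOS", create_time + ".ipa"
    # No recognizable suffix: keep the type given on the command line (added in this commit).
    if declared_type == "Android":
        return "Android", create_time + ".apk"
    if declared_type == "iOS":
        return "iOS", create_time + ".ipa"
    return "WEB", create_time + ".html"


if __name__ == "__main__":
    # A download link without a file suffix keeps the declared android type.
    print(correct_type("https://127.0.0.1/download?id=1", "Android", "20200101000000"))
    # A .apk suffix overrides a declared ios type, as the README describes.
    print(correct_type("Demo.apk", "iOS", "20200101000000"))
```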
--- a/libs/task/ios_task.py
+++ b/libs/task/ios_task.py
@@ -13,10 +13,8 @@ from queue import Queue
 class iOSTask(object):
    elf_file_name = ""
-    def __init__(self,path,no_resource):
+    def __init__(self,path):
        self.path = path
        self.no_resource = no_resource
        self.file_queue = Queue()
        self.shell_flag = False
        self.file_identifier= []
@ -78,7 +76,7 @@ class iOSTask(object):
                    self.__get_file_header__(dir_file_path)
                    self.file_queue.put(dir_file_path)
                    continue
-                if self.no_resource:    
+                if cores.resource_flag:    
                    dir_file_suffix =  dir_file.split(".")
                    if len(dir_file_suffix) > 1:
                        if dir_file_suffix[-1] in file_suffix:
--- a/libs/task/net_task.py
+++ b/libs/task/net_task.py
@ -5,6 +5,7 @@
 import re
 import xlwt
 import socket
 import config
 from queue import Queue
 import libs.core as cores
 from libs.core.net import NetThreads
@ -14,13 +15,14 @@ class NetTask(object):
    value_list = []
    domain_list=[]
-    def __init__(self,result_dict,app_history_list,file_identifier,threads):
+    def __init__(self,result_dict,app_history_list,domain_history_list,file_identifier,threads):
        self.result_dict = result_dict
        self.app_history_list = app_history_list
        self.file_identifier = file_identifier
        self.domain_queue = Queue()
-        self.threads = threads 
+        self.threads = int(threads) 
        self.thread_list = []
        self.domain_history_list = domain_history_list
    def start(self):
        xls_result_path = cores.xls_result_path
@ -55,13 +57,22 @@ class NetTask(object):
        with open(txt_result_path,"a+",encoding='utf-8',errors='ignore') as f:
            for key,value in self.result_dict.items():
-                f.write(key+"\r")
+                if cores.all_flag:
                    f.write(key+"\r")
                for result in value:
                    if result in self.value_list:
                        continue
                    self.value_list.append(result)
                    if cores.all_flag:
                        f.write("\t"+result+"\r")
                    if (("http://" in result) or ("https://" in result)) and ("." in result):
                        domain = result.replace("https://","").replace("http://","")
                        if "{" in result or "}" in result or "[" in result or "]" in result:
                            continue
                        if "/" in domain:
                            domain = domain[:domain.index("/")]
@ -70,11 +81,14 @@ class NetTask(object):
                        # Among domains currently in circulation, the shortest length including the protocol prefix is 11 characters
                        if len(result) <= 10:
                            continue
-                        self.domain_queue.put({"domain":domain,"url_ip":result})
-
+                        
+                        url_suffix = result[result.rindex(".")+1:].lower()
+                        if not(cores.resource_flag and url_suffix in config.sniffer_filter):
+                            self.domain_queue.put({"domain":domain,"url_ip":result})
                        for identifier in self.file_identifier:
                            if identifier in self.app_history_list:
-                                if not(domain in self.domain_list):
+                                if not(domain in self.domain_history_list): 
                                    self.domain_list.append(domain)
                                    self.__write_content_in_file__(cores.domain_history_path,domain)
                                continue
@ -86,8 +100,7 @@ class NetTask(object):
                            if append_file_flag:
                                self.__write_content_in_file__(cores.app_history_path,identifier)
                                append_file_flag = False
-                    self.value_list.append(result)
+                    
                    f.write("\t"+result+"\r")
            f.close()
    def __start_threads__(self,worksheet):
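
The URL handling in the `net_task.py` hunks above can be read as two steps: derive a bare domain from each result, then decide whether the URL should be queued for sniffing. A minimal standalone sketch of that logic follows. The function names `extract_domain` and `should_sniff` are hypothetical, and the `sniffer_filter` values used in the usage note are placeholders, since the real contents of `config.sniffer_filter` are not shown in this diff.

```python
def extract_domain(url):
    """Strip the scheme, then cut everything from the first slash onward."""
    domain = url.replace("https://", "").replace("http://", "")
    if "/" in domain:
        domain = domain[:domain.index("/")]
    return domain


def should_sniff(url, resource_flag, sniffer_filter):
    """Return True if the URL passes the checks applied before queuing.

    resource_flag and sniffer_filter stand in for cores.resource_flag and
    config.sniffer_filter; their real values are assumptions here.
    """
    # Must look like an http(s) URL that contains a dot.
    if not (("http://" in url or "https://" in url) and "." in url):
        return False
    # Skip template-like strings containing braces or brackets.
    if any(ch in url for ch in "{}[]"):
        return False
    # With the scheme included, anything of 10 characters or fewer is skipped.
    if len(url) <= 10:
        return False
    # Text after the last dot, lowercased, is treated as the suffix.
    url_suffix = url[url.rindex(".") + 1:].lower()
    return not (resource_flag and url_suffix in sniffer_filter)
```

For example, with a placeholder filter of `["png", "jpg"]`, `should_sniff("https://example.com/a/b.png", True, ["png", "jpg"])` is rejected, while the same URL is accepted when `resource_flag` is `False`. Note that the suffix is simply the text after the last dot, so a URL without a file extension yields a "suffix" such as `com/index`, which will not match a filter of plain extensions.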
--- a/result.png
+++ b/result.png
--- a/update.md
+++ b/update.md
@ -1,7 +1,12 @@
 ### V1.0.7
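 - Added automatic file download: supports downloading APK files and non-App Store IPA files, and caching H5 or HTML pages
 - Added suffix-based correction: a mistakenly entered task type is corrected automatically based on the file suffix
-
+- Optimized the AI intelligent analysis module
+- Optimized the entry-point parameters and their descriptions
+- Optimized the parameters in the configuration file
+- Optimized parts of the code execution logic
+- Updated the README.md content
+- Fixed the automatic download progress issue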
 ### V1.0.6
 - Added AI intelligent analysis to quickly filter third-party URL addresses