Adjust the way of introducing JRE APIs and Introduce Cut-Shortcut into Tai-e #215

Open
for-just-we wants to merge 3 commits into pascal-lab:master from for-just-we:cut-shortcut

for-just-we commented Feb 16, 2026

Example arguments: java -jar Tai-e-all.jar -cp <path/to/program> -m Main -lj <path/to/JREs> -java 8 -a "pta=cs:ci;solver:csc;dump-yaml:true;only-dump-app:true;"

Here, -lj <path/to/JREs> tells Tai-e to load the JRE libraries from <path/to/JREs> instead of the hard-coded java-benchmarks/JREs; solver:csc in the analysis options selects Cut-Shortcut instead of the default solver; and only-dump-app:true means that, when the YAML analysis result is dumped, only the results for application code are written.

1. Adjust the way of introducing JRE APIs

In AbstractWorldBuilder.getClassPath, Tai-e loads the JRE libraries from the hard-coded path java-benchmarks/JREs, which is hard to use when Tai-e is packaged into an executable jar. Here I add a new command-line option, libJREPath, that lets the user specify a custom path; it defaults to java-benchmarks/JREs.

@JsonProperty
@Option(names = {"-lj", "--lib-jre"},
       description = "JRE library path, (default: ${DEFAULT-VALUE})",
       defaultValue = "java-benchmarks/JREs")
private String libJREPath;
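
As a rough illustration of how such an option could feed into class-path construction, here is a minimal sketch. It is not the code in this PR: the class JreClassPathResolver, its method names, and the assumed directory layout <libJREPath>/jre1.<version>/ are all hypothetical and only show the idea of replacing the hard-coded root with a configurable one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not Tai-e's actual code: locates JRE jars under a
// user-supplied root instead of a hard-coded directory.
class JreClassPathResolver {

    // Same default as the -lj option above.
    static final String DEFAULT_LIB_JRE_PATH = "java-benchmarks/JREs";

    // Assumed layout: <libJrePath>/jre1.<javaVersion>/
    static Path jreDir(String libJrePath, int javaVersion) {
        String root = (libJrePath == null || libJrePath.isEmpty())
                ? DEFAULT_LIB_JRE_PATH
                : libJrePath;
        return Path.of(root, "jre1." + javaVersion);
    }

    // Collects every .jar file below the selected JRE directory.
    static List<String> resolve(String libJrePath, int javaVersion) throws IOException {
        Path dir = jreDir(libJrePath, javaVersion);
        if (!Files.isDirectory(dir)) {
            throw new IOException("JRE directory not found: " + dir);
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(p -> p.toString().endsWith(".jar"))
                    .map(Path::toString)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

With this shape, passing null or an empty string keeps the old behaviour, so existing invocations without -lj would not change.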

2. Dump only the application-code points-to analysis results (may not be necessary)

Currently, with dump-ci:true or dump-yaml:true, Tai-e dumps results for all variables. However, application code is usually a small part of the whole program. Hence I add an option, only-dump-app, in AnalysisOption (Tai-e-analysis.yml). It defaults to false; when set to true, Tai-e dumps only the results for application code.
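
The filtering idea can be sketched as follows. DumpVar and AppOnlyDumper are hypothetical stand-ins, not classes from Tai-e or this PR; the sketch assumes each variable already knows whether its declaring class is an application class.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an analyzed variable; not a Tai-e class.
final class DumpVar {
    final String name;
    final boolean inApplicationClass;
    final Set<String> pointsTo;

    DumpVar(String name, boolean inApplicationClass, Set<String> pointsTo) {
        this.name = name;
        this.inApplicationClass = inApplicationClass;
        this.pointsTo = pointsTo;
    }
}

class AppOnlyDumper {

    // Returns the entries that would be dumped. With onlyDumpApp == false
    // (the default) every variable is kept; with true, library variables
    // are skipped.
    static Map<String, Set<String>> select(List<DumpVar> vars, boolean onlyDumpApp) {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (DumpVar v : vars) {
            if (onlyDumpApp && !v.inApplicationClass) {
                continue;
            }
            out.put(v.name, v.pointsTo);
        }
        return out;
    }
}
```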

3. Introducing Cut-Shortcut

I downloaded the code from the Cut-Shortcut artefact and integrated it into the latest Tai-e commit. It passes the 4 test cases provided in the zip file below. This implementation supports context-sensitive analysis.

Local flow and field access analysis follow the implementation in the paper artefact, while container access differs. To keep the analysis result sound, only PFG edges from the return value of a container/iterator exit method with a modeled type to its call site are cut. Container types such as ArrayList and HashMap are modeled; custom container types such as an anonymous AbstractList are not considered. The full API list is provided in container-config.

Testcases

CutShortcutTestcases.zip

Implementations that may be refined later

  1. Modeling of [Transfer] methods.

Cut-Shortcut supports the following [Transfer] methods. Following the original artefact, we match the method name at the call site against keywords such as keySet and entrySet. Handling this in onNewCallEdge might be better, but it is hard to collect all relevant callee method signatures.

graph LR

Col --iterator+transfer-->  COL_ITR --next+exit--> COL_VALUE

MAP --entrySet+transfer--> MAP_ENTRY_SET --iterator+transfer--> MAP_ENTRY_ITR --next+transfer--> MAP_ENTRY
MAP_ENTRY --getKey+exit--> MAP_KEY
MAP_ENTRY --getValue+exit--> MAP_VALUE

MAP --keySet+transfer--> MAP_KEY_SET --iterator+transfer--> MAP_KEY_ITR --next+exit --> MAP_KEY
MAP --values+transfer--> MAP_VALUES --iterator+transfer--> MAP_VALUE_ITR --next+exit --> MAP_VALUE
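
The graph above can be read as a small transition table keyed by host kind and method name. The following is a minimal sketch of that reading only; HostKind, StepKind, Step and HostTransitions are hypothetical names, and matching on bare method names mirrors the keyword matching described above rather than any code in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Node names taken from the graph; the enum itself is hypothetical.
enum HostKind {
    COL, COL_ITR, COL_VALUE,
    MAP, MAP_ENTRY_SET, MAP_ENTRY_ITR, MAP_ENTRY,
    MAP_KEY_SET, MAP_KEY_ITR, MAP_VALUES, MAP_VALUE_ITR,
    MAP_KEY, MAP_VALUE
}

enum StepKind { TRANSFER, EXIT }

final class Step {
    final StepKind kind;
    final HostKind target;

    Step(StepKind kind, HostKind target) {
        this.kind = kind;
        this.target = target;
    }
}

class HostTransitions {
    private static final Map<String, Step> TABLE = new HashMap<>();

    private static void add(HostKind from, String method, StepKind kind, HostKind to) {
        TABLE.put(from + "#" + method, new Step(kind, to));
    }

    static {
        add(HostKind.COL, "iterator", StepKind.TRANSFER, HostKind.COL_ITR);
        add(HostKind.COL_ITR, "next", StepKind.EXIT, HostKind.COL_VALUE);

        add(HostKind.MAP, "entrySet", StepKind.TRANSFER, HostKind.MAP_ENTRY_SET);
        add(HostKind.MAP_ENTRY_SET, "iterator", StepKind.TRANSFER, HostKind.MAP_ENTRY_ITR);
        add(HostKind.MAP_ENTRY_ITR, "next", StepKind.TRANSFER, HostKind.MAP_ENTRY);
        add(HostKind.MAP_ENTRY, "getKey", StepKind.EXIT, HostKind.MAP_KEY);
        add(HostKind.MAP_ENTRY, "getValue", StepKind.EXIT, HostKind.MAP_VALUE);

        add(HostKind.MAP, "keySet", StepKind.TRANSFER, HostKind.MAP_KEY_SET);
        add(HostKind.MAP_KEY_SET, "iterator", StepKind.TRANSFER, HostKind.MAP_KEY_ITR);
        add(HostKind.MAP_KEY_ITR, "next", StepKind.EXIT, HostKind.MAP_KEY);

        add(HostKind.MAP, "values", StepKind.TRANSFER, HostKind.MAP_VALUES);
        add(HostKind.MAP_VALUES, "iterator", StepKind.TRANSFER, HostKind.MAP_VALUE_ITR);
        add(HostKind.MAP_VALUE_ITR, "next", StepKind.EXIT, HostKind.MAP_VALUE);
    }

    // Empty result means the method is neither [Transfer] nor [Exit] for this host.
    static Optional<Step> step(HostKind from, String methodName) {
        return Optional.ofNullable(TABLE.get(from + "#" + methodName));
    }
}
```

Note that next appears both as [Exit] (from COL_ITR, MAP_KEY_ITR, MAP_VALUE_ITR) and as [Transfer] (from MAP_ENTRY_ITR), which is why the host kind of the receiver matters and the method name alone is not enough.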
  2. Identification of Map.entrySet().iterator().next()

Although the type of Map.entrySet() is Set, and Map.entrySet().iterator().next() calls Set.iterator().next(), its element type is Map$Entry, so the call is a [Transfer] rather than an [Exit].

Hence, in ContainerAccessHandler.CutReturnEdge, when cutting the return value of Iterator.next(), we need to make sure the element type is not Map$Entry. Here I collect all variables that may have type Map$Entry by analyzing cast statements. For example, in the code below, $r12 is a Map$Entry, hence $r12 = invokeinterface $r11.<java.util.Iterator: java.lang.Object next()>(); is a [Transfer] call.

I guess there may be a better way to identify [Transfer] calls than analyzing cast statements?

$r12 = invokeinterface $r11.<java.util.Iterator: java.lang.Object next()>();
r13 = (java.util.Map$Entry) $r12;
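
The cast-based heuristic can be sketched on a toy IR as follows. Stmt, InvokeStmt, CastStmt and EntryNextFinder are hypothetical and much simpler than Tai-e's IR; the sketch only shows the two passes: collect the variables cast to Map$Entry, then classify next() calls by whether their result is one of them.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy IR for illustration only; these are not Tai-e classes.
abstract class Stmt {}

final class InvokeStmt extends Stmt {
    final String result;    // variable receiving the return value
    final String receiver;  // variable the method is invoked on
    final String method;    // bare method name, e.g. "next"

    InvokeStmt(String result, String receiver, String method) {
        this.result = result;
        this.receiver = receiver;
        this.method = method;
    }
}

final class CastStmt extends Stmt {
    final String lhs;       // variable receiving the cast value
    final String rhs;       // variable being cast
    final String castType;  // fully qualified target type

    CastStmt(String lhs, String rhs, String castType) {
        this.lhs = lhs;
        this.rhs = rhs;
        this.castType = castType;
    }
}

class EntryNextFinder {
    static final String MAP_ENTRY = "java.util.Map$Entry";

    // Pass 1: every variable that is cast to Map$Entry somewhere in the body.
    static Set<String> entryVars(List<Stmt> body) {
        Set<String> vars = new HashSet<>();
        for (Stmt s : body) {
            if (s instanceof CastStmt && MAP_ENTRY.equals(((CastStmt) s).castType)) {
                vars.add(((CastStmt) s).rhs);
            }
        }
        return vars;
    }

    // Pass 2: a next() call whose result is later cast to Map$Entry is
    // treated as [Transfer]; its return edge should not be cut as [Exit].
    static boolean isTransferNext(InvokeStmt call, Set<String> entryVars) {
        return "next".equals(call.method)
                && call.result != null
                && entryVars.contains(call.result);
    }
}
```

A limitation of this heuristic, under the same assumptions, is that a next() result which is never cast (for example, one passed on as Object) would not be recognized.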

Issues

  1. The PFG edges from the return values of Map.keySet()/values() to their call-site receivers are marked with FlowKind.LOCAL_ASSIGN instead of FlowKind.RETURN. Hence CutShortcutSolver.needPropagateHost treats them as edges that may propagate $\text{pts}_H$, although they should not. In the following example, $\text{pts}_H(\text{mapVar}.\text{values()}) = \text{pts}_H(\text{mapVar})$, but it can also receive $\text{pts}_H$ from the callee, so $\text{pts}_H(\text{mapVar})$ may contain irrelevant map objects from other contexts, making $\text{pts}(\text{item})$ contain more information than item1 and item2.
mapVar = new HashMap(); // o1
mapVar.put(key1, item1);
mapVar.put(key2, item2);

listVar = new ArrayList(); // o2
listVar.addAll(mapVar.values());

item = listVar.get(...)
  2. Array initializers such as those in Collections and Arrays$ArrayList <init> make the objects pointed to by array parameters cross-reference each other, which hurts the precision of $\text{pts}_H$ of Collection variables. For example, Collections.addAll is implemented as shown below. These methods are not marked as ignored, so the side effects of the method body are included in the analysis. Hence, for the code Collections.addAll(list, array);, the analysis may add more irrelevant information to list than array provides. One way to solve this is to mark array initializers as ignorable, but then the points-to sets inside the container become unsound.
public static <T> boolean addAll(Collection<? super T> c, T... elements) {
        boolean result = false;
        for (T element : elements)
            result |= c.add(element);
        return result;
}
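
For reference, the concrete behaviour that a precise analysis would ideally approximate is sketched below: each list ends up holding exactly the elements of the array passed at its own call site. AddAllDemo is a hypothetical example class. Imprecision arises when the analysis merges the elements parameter across callers of addAll, so that both lists appear to hold the elements of both arrays.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class showing the runtime ground truth only.
class AddAllDemo {

    // Each call creates a fresh list and fills it from its own varargs array.
    static List<String> fill(String... elements) {
        List<String> list = new ArrayList<>();
        Collections.addAll(list, elements);
        return list;
    }
}
```

For example, fill("a1", "a2") and fill("b1") return lists with disjoint contents at runtime, whereas an analysis that merges the two elements arrays inside addAll would report all three strings for both lists.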

for-just-we and others added 3 commits January 12, 2026 14:40
- 1. `Map.entrySet().iterator().next()` should be deemed [Transfer] instead of [Exit]. However, this is not easy to recognize in ContainerAccessHandler.CutReturnEdge, which makes the points-to set of the receiver of the Map.entrySet().iterator().next() call empty. Currently we use a naive method that collects the variables of type `Map$Entry` and uses them to recognize which `Iterator.next()` call sites return a `Map$Entry`, so that they can be filtered out.

- 2. PFG edges from the return values of `Map.keySet()/values()` to their call-site receivers are marked with `FlowKind.LOCAL_ASSIGN` instead of `FlowKind.RETURN`, so ptsH of `Map.keySet()/values()` is not optimized.

- 3. `Array-Initializer` methods such as those in `Collections` and `Arrays$ArrayList <init>` make the objects pointed to by array parameters cross-reference each other and affect the precision of ptsH of `Collection` variables.
github-actions bot commented Feb 16, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@for-just-we

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Feb 16, 2026