Inference Configurations
This page describes the inference configurations of the open-loop and closed-loop methods.
Closed-Loop Inference
In closed-loop inference, the main module (which runs CARLA) shares the same config file with the VLM module.
Example
{
"TASK_CONFIGS": {
"FRAME_PER_SEC": 10
},
"INFERENCE_BASICS": {
"INPUT_WINDOW": 1, // frame count of given image input
"CONVERSATION_WINDOW": 1, // not used anymore, to be removed
"USE_ALL_CAMERAS": false, // true if use all cameras as input
"USE_BEV": false, // true if use bev as input
"NO_HISTORY_MODE": false // do not inherit context of previous VQAs
},
"CHAIN": { // for inference
"NODE": [19, 15, 7, 24, 13, 47, 8, 43, 50],
"EDGE": { // "pred": succ
"19": [24, 13, 8],
"15": [7, 8],
"7": [8],
"24": [13, 47],
"13": [47, 8, 43],
"47": [8],
"8": [43],
"43": [50],
"50": []
},
"INHERIT": { // inherit context from last frame
"19": [43, 7],
"15": [7]
},
"USE_GT": [24] // questions which use ground truth as answer
},
"CONTROL_RATE": 2.0, // intervention frequency of the VLM
"MODEL_NAME": "api", // model name, please check out supported models
"MODEL_PATH": "../model_zoo/your_model", // model path
"GPU_ID": 0, // the gpu model runs on
"PORT": 7023, // web port
"IN_CARLA": true,
"USE_BASE64": true, // if false, local path is used for transmitting images
"NO_PERC_INFO": false // do not pass extra perception info to vlm via prompt
}
Fields
TASK_CONFIGS
- FRAME_PER_SEC (type: int): The sampling rate of the sensors.
INFERENCE_BASICS
- INPUT_WINDOW (type: int): Frame count of the given visual input.
- CONVERSATION_WINDOW (type: int): Not used anymore, to be removed. Used to determine context length.
- USE_ALL_CAMERAS (type: bool): True if all 6 cameras are used as input.
- USE_BEV (type: bool): True if the BEV image is used as input.
- NO_HISTORY_MODE (type: bool): True if the context of previous VQAs should not be inherited.
CHAIN
Please refer to graph configs.
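The EDGE map lists each question's successors, so the questions of a frame form a directed acyclic graph. As a rough sketch of how a valid asking order can be derived from that map (this is an illustration using Kahn's topological sort, not B2DVL's actual scheduler):

```python
from collections import deque

def question_order(nodes, edges):
    # Kahn's algorithm: ask each question only after all of its
    # predecessors in the EDGE map have been answered.
    indeg = {n: 0 for n in nodes}
    for succs in edges.values():
        for s in succs:
            indeg[s] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for s in edges.get(n, []):
            indeg[s] -= 1
            if indeg[s] == 0:
                queue.append(s)
    return order

# The closed-loop example chain from above.
nodes = [19, 15, 7, 24, 13, 47, 8, 43, 50]
edges = {19: [24, 13, 8], 15: [7, 8], 7: [8], 24: [13, 47],
         13: [47, 8, 43], 47: [8], 8: [43], 43: [50], 50: []}
```

Any order returned this way respects every "pred": succ constraint in EDGE.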
Other Fields
- CONTROL_RATE (type: float): Intervention frequency of the VLM.
- MODEL_NAME (type: str): Model name; please check out the supported models in models/register.py, or refer to the VLM adapting tutorial.
- MODEL_PATH (type: str): Path to the folder of your model downloaded from Hugging Face.
- GPU_ID (type: int): The GPU your VLM runs on.
- PORT (type: int): The web port used to connect CARLA and the VLM.
- IN_CARLA (type: bool): An environment variable which must be set to true when doing closed-loop evaluation.
- USE_BASE64 (type: bool): If true, base64 is used for transmitting images; otherwise, a local path is used.
- NO_PERC_INFO (type: bool): If true, extra perception info won't be passed to the VLM via the text prompt.
When CARLA and the VLM are running on separate machines, you may enable USE_BASE64
and apply port forwarding to redirect the VLM's port to the local environment.
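As a minimal sketch of what USE_BASE64 means for transmitting images (the payload keys and helper names here are illustrative, not B2DVL's actual API):

```python
import base64

def encode_image(path: str) -> str:
    # Read an image file and return its base64-encoded contents as text.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def build_payload(image_paths, use_base64=True):
    # With USE_BASE64 enabled, the image bytes travel inside the request
    # body, which works across machines; otherwise only local file paths
    # are sent, which requires CARLA and the VLM to share a filesystem.
    if use_base64:
        return {"images": [encode_image(p) for p in image_paths]}
    return {"image_paths": list(image_paths)}
```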
Open-Loop Inference
Example
{
"TASK_CONFIGS": {
"INFER_SUBSET": false,
"USE_CHECKPOINT": true,
"SUBSET_FILE": "./infer_configs/subset.txt",
"CHECKPOINT_FILE": "./infer_configs/finished_scenarios.txt",
"ENTRY_EXIT_FILE": "./infer_configs/entry_exits.json",
"FRAME_PER_SEC": 10
},
"INFERENCE_BASICS": {
"INPUT_WINDOW": 1,
"CONVERSATION_WINDOW": 2,
"USE_ALL_CAMERAS": true,
"NO_HISTORY_MODE": false,
"APPEND_QUESTION": true,
"APPENDIX_FILE": "./infer_configs/append_questions.json"
},
"CHAIN": {
"NODE": [43, 50],
"EDGE": {
"43": [50],
"50": []
},
"INHERIT": {
"19": [43, 7],
"15": [7]
},
"USE_GT": []
}
}
Fields
TASK_CONFIGS
- INFER_SUBSET (type: bool): Whether to run inference on a subset of the given dataset.
- USE_CHECKPOINT (type: bool): Whether to use a checkpoint file to record inference progress, so that B2DVL only infers scenarios not listed in the checkpoint file.
- SUBSET_FILE (type: str): Path to the subset file. You can leave it blank if not used.
- CHECKPOINT_FILE (type: str): Path to the checkpoint file. You can leave it blank if not used.
- ENTRY_EXIT_FILE (type: str): Path to the entry-exit file, which specifies the entry and exit points of certain scenarios.
- FRAME_PER_SEC (type: int): The sampling rate of the sensors.
INFERENCE_BASICS
- INPUT_WINDOW (type: int): Frame count of the given visual input.
- CONVERSATION_WINDOW (type: int): Not used anymore, to be removed. Used to determine context length.
- USE_ALL_CAMERAS (type: bool): True if all 6 cameras are used as input.
- USE_BEV (type: bool): True if the BEV image is used as input.
- NO_HISTORY_MODE (type: bool): True if the context of previous VQAs should not be inherited.
- APPEND_QUESTION (type: bool): True if appended questions are used.
- APPENDIX_FILE (type: str): Path to the appended-question file, which contains extra questions you define that are not included in the supported questions.
CHAIN
Please refer to graph configs.
Subset File
A subset file simply consists of scenario folder names, one per line. If a subset file is used, B2DVL will only run inference on the scenarios listed in it.
An example:
AccidentTwoWays_Town12_Route1102_Weather10
AccidentTwoWays_Town12_Route1103_Weather11
Accident_Town03_Route101_Weather23
Accident_Town03_Route102_Weather20
BlockedIntersection_Town03_Route134_Weather3
BlockedIntersection_Town03_Route135_Weather5
ConstructionObstacleTwoWays_Town12_Route1080_Weather14
ConstructionObstacleTwoWays_Town12_Route1083_Weather9
ConstructionObstacle_Town03_Route60_Weather8
ConstructionObstacle_Town03_Route61_Weather9
ControlLoss_Town04_Route169_Weather13
ControlLoss_Town04_Route170_Weather14
CrossingBicycleFlow_Town12_Route1011_Weather23
CrossingBicycleFlow_Town12_Route1012_Weather23
DynamicObjectCrossing_Town01_Route1_Weather1
DynamicObjectCrossing_Town01_Route2_Weather2
EnterActorFlow_Town03_Route132_Weather2
EnterActorFlow_Town04_Route192_Weather10
HardBreakRoute_Town01_Route30_Weather3
HardBreakRoute_Town01_Route31_Weather5
HazardAtSideLaneTwoWays_Town12_Route1128_Weather10
HazardAtSideLaneTwoWays_Town12_Route1129_Weather11
HazardAtSideLane_Town03_Route105_Weather22
HazardAtSideLane_Town03_Route106_Weather23
HighwayCutIn_Town06_Route298_Weather20
HighwayCutIn_Town06_Route299_Weather13
HighwayExit_Town06_Route291_Weather5
HighwayExit_Town06_Route292_Weather14
InterurbanActorFlow_Town06_Route294_Weather8
InterurbanActorFlow_Town06_Route314_Weather2
InterurbanAdvancedActorFlow_Town06_Route301_Weather15
InterurbanAdvancedActorFlow_Town06_Route302_Weather21
InvadingTurn_Town02_Route95_Weather9
InvadingTurn_Town02_Route99_Weather21
LaneChange_Town06_Route277_Weather9
LaneChange_Town06_Route307_Weather21
MergerIntoSlowTrafficV2_Town12_Route1009_Weather21
MergerIntoSlowTrafficV2_Town12_Route1010_Weather22
MergerIntoSlowTraffic_Town06_Route317_Weather5
MergerIntoSlowTraffic_Town12_Route1003_Weather8
NonSignalizedJunctionLeftTurnEnterFlow_Town12_Route1022_Weather8
NonSignalizedJunctionLeftTurnEnterFlow_Town12_Route1035_Weather21
NonSignalizedJunctionLeftTurn_Town03_Route122_Weather26
NonSignalizedJunctionLeftTurn_Town03_Route123_Weather26
NonSignalizedJunctionRightTurn_Town03_Route126_Weather18
NonSignalizedJunctionRightTurn_Town04_Route184_Weather2
OppositeVehicleRunningRedLight_Town03_Route119_Weather12
OppositeVehicleRunningRedLight_Town03_Route120_Weather8
OppositeVehicleTakingPriority_Town03_Route128_Weather23
OppositeVehicleTakingPriority_Town03_Route155_Weather25
ParkedObstacleTwoWays_Town12_Route1158_Weather14
ParkedObstacleTwoWays_Town12_Route1159_Weather23
ParkedObstacle_Town03_Route103_Weather25
ParkedObstacle_Town03_Route147_Weather0
ParkingCrossingPedestrian_Town12_Route758_Weather3
ParkingCrossingPedestrian_Town12_Route759_Weather5
ParkingCutIn_Town12_Route1300_Weather13
ParkingCutIn_Town12_Route1301_Weather14
ParkingExit_Town12_Route1305_Weather18
ParkingExit_Town12_Route1307_Weather20
PedestrianCrossing_Town12_Route1013_Weather25
PedestrianCrossing_Town12_Route1014_Weather0
SignalizedJunctionLeftTurnEnterFlow_Town12_Route1019_Weather5
SignalizedJunctionLeftTurnEnterFlow_Town12_Route1020_Weather6
SignalizedJunctionLeftTurn_Town03_Route113_Weather26
SignalizedJunctionLeftTurn_Town03_Route114_Weather6
SignalizedJunctionRightTurn_Town03_Route118_Weather14
SignalizedJunctionRightTurn_Town03_Route151_Weather2
StaticCutIn_Town03_Route109_Weather1
StaticCutIn_Town03_Route110_Weather6
TJunction_Town01_Route90_Weather12
TJunction_Town01_Route91_Weather13
VanillaNonSignalizedTurnEncounterStopsign_Town03_Route143_Weather13
VanillaNonSignalizedTurnEncounterStopsign_Town03_Route144_Weather14
VanillaSignalizedTurnEncounterGreenLight_Town03_Route137_Weather7
VanillaSignalizedTurnEncounterGreenLight_Town03_Route139_Weather9
VanillaSignalizedTurnEncounterRedLight_Town03_Route140_Weather10
VanillaSignalizedTurnEncounterRedLight_Town03_Route141_Weather11
VehicleOpensDoorTwoWays_Town12_Route1196_Weather0
VehicleOpensDoorTwoWays_Town12_Route1197_Weather1
VehicleTurningRoutePedestrian_Town12_Route1027_Weather13
VehicleTurningRoutePedestrian_Town12_Route1040_Weather0
VehicleTurningRoute_Town12_Route1026_Weather12
VehicleTurningRoute_Town12_Route822_Weather18
YieldToEmergencyVehicle_Town03_Route148_Weather18
YieldToEmergencyVehicle_Town04_Route165_Weather7
This example lists exactly two routes for each scenario class.
Checkpoint File
The checkpoint file shares the same format as the subset file. For each line, if the line is a substring of the current scenario's folder path, that scenario is skipped.
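The substring-matching skip rule can be sketched as follows (the helper names are hypothetical; the real loader may differ):

```python
def load_lines(path):
    # Read non-empty lines from a subset or checkpoint file.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def should_skip(scenario_folder_path, checkpoint_lines):
    # Skip a scenario if any checkpoint line is a substring of its folder path.
    return any(line in scenario_folder_path for line in checkpoint_lines)
```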
Entry Exit File
In some scenarios, you might not want the VLM to infer all of the frames. You can use an entry-exit file to configure an interval. For example:
{
"Accident_Town03_Route101_Weather23": {
"entry": 40,
"exit": 120
},
"YieldToEmergencyVehicle_Town03_Route148_Weather18": {
"entry": 5,
"exit": 150
},
"ConstructionObstacleTwoWays_Town12_Route1080_Weather14": {
"entry": 15,
"exit": 140
},
"ConstructionObstacle_Town03_Route60_Weather8": {
"entry": 15,
"exit": 150
},
"CrossingBicycleFlow_Town12_Route1062_Weather22": {
"entry": 330,
"exit": 370
},
"HardBreakRoute_Town01_Route30_Weather3": {
"entry": 10,
"exit": 80
},
"ParkedObstacleTwoWays_Town12_Route1158_Weather14": {
"entry": 20,
"exit": 245
}
}
This file specifies the starting and ending points of 7 scenarios. When inferring them, only frames within [entry, exit)
are inferred. For scenarios which are not mentioned, all frames are inferred.
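The [entry, exit) filtering can be sketched like this (a simplified helper assuming integer frame indices; not B2DVL's actual implementation):

```python
def frames_to_infer(scenario, frame_ids, entry_exit):
    # entry_exit maps scenario folder names to {"entry": ..., "exit": ...}.
    # Unlisted scenarios keep all frames; listed ones keep [entry, exit).
    if scenario not in entry_exit:
        return list(frame_ids)
    lo, hi = entry_exit[scenario]["entry"], entry_exit[scenario]["exit"]
    return [i for i in frame_ids if lo <= i < hi]
```

Note that the exit frame itself is excluded, since the interval is half-open.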
Appendix File
If you want to add extra questions for the VLM to answer, you can write an appendix file to define them, but remember to pick unused qids. Since customized questions are not supported by DriveCommenter, they cannot be evaluated by our evaluation module; you have to do some extra work to evaluate them.
An example:
[
{
"Q": "According to the current situation, what should the ego vehicle do?",
"A": "To be simulated",
"C": null,
"qid": 52,
"con_up": [
[
-1,
-1
]
],
"con_down": [
[
-1,
-1
]
],
"cluster": -1,
"layer": -1,
"object_tags": [
]
},
{
"Q": "According to the current situation, which direction should the ego vehicle go? A) Go straight along the current lane. B) Change to the left lane. C) Change to the right lane. D) Steer left. E) Steer right. If the answer is in D or E, try to provide the precise angle for steering.",
"A": "To be simulated",
"C": null,
"qid": 53,
"con_up": [
[
-1,
-1
]
],
"con_down": [
[
-1,
-1
]
],
"cluster": -1,
"layer": -1,
"object_tags": [
]
},
{
"Q": "According to the current situation, at what speed should the ego vehicle drive? A) Remain current speed. B) Accelerate. C) Decelerate. D) Stop. After select these options, try to provide precise target speed.",
"A": "To be simulated",
"C": null,
"qid": 54,
"con_up": [
[
-1,
-1
]
],
"con_down": [
[
-1,
-1
]
],
"cluster": -1,
"layer": -1,
"object_tags": [
]
}
]
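When picking qids for appended questions, a quick collision check helps. A minimal sketch (the helper name is hypothetical, and the set of qids already in use depends on your question set):

```python
def colliding_qids(appendix_questions, used_qids):
    # Return qids from the appendix file that clash with question ids
    # already used by the supported questions.
    used = set(used_qids)
    return [q["qid"] for q in appendix_questions if q["qid"] in used]
```

Running this on a parsed appendix file before inference avoids silently overriding a supported question.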