Hi, in my case I want to create an Arrow file on the client side and then pass it to the server side. But when I simply run writeArrowFeather, it throws an IndexOutOfBoundsException:
Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 31393, length: 2320 (expected: range(0, 32768))
at org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:701)
at org.apache.arrow.memory.ArrowBuf.setBytes(ArrowBuf.java:765)
at org.apache.arrow.vector.BaseVariableWidthVector.setBytes(BaseVariableWidthVector.java:1244)
at org.apache.arrow.vector.BaseVariableWidthVector.set(BaseVariableWidthVector.java:1059)
at org.apache.arrow.vector.VarCharVector.set(VarCharVector.java:255)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl$infillVector$1.invoke(ArrowWriterImpl.kt:111)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl$infillVector$1.invoke(ArrowWriterImpl.kt:111)
at org.jetbrains.kotlinx.dataframe.api.ForEachKt.forEachIndexed(forEach.kt:34)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.infillVector(ArrowWriterImpl.kt:111)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.allocateVectorAndInfill(ArrowWriterImpl.kt:197)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.allocateVectorSchemaRoot(ArrowWriterImpl.kt:223)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriter$DefaultImpls.writeArrowFeather(ArrowWriter.kt:114)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.writeArrowFeather(ArrowWriterImpl.kt:61)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriter$DefaultImpls.writeArrowFeather(ArrowWriter.kt:125)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.writeArrowFeather(ArrowWriterImpl.kt:61)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriter$DefaultImpls.writeArrowFeather(ArrowWriter.kt:133)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.writeArrowFeather(ArrowWriterImpl.kt:61)
at org.jetbrains.kotlinx.dataframe.io.ArrowWritingKt.writeArrowFeather(arrowWriting.kt:89)
at com.phodal.chapi.arrow.MainKt.main(Main.kt:26)
Suppressed: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (33024)
Allocator(ROOT) 0/33024/264192/9223372036854775807 (res/actual/peak/limit)
at org.apache.arrow.memory.BaseAllocator.close(BaseAllocator.java:437)
at org.apache.arrow.memory.RootAllocator.close(RootAllocator.java:29)
at org.jetbrains.kotlinx.dataframe.io.ArrowWriterImpl.close(ArrowWriterImpl.kt:247)
at kotlin.jdk7.AutoCloseableKt.closeFinally(AutoCloseable.kt:64)
at org.jetbrains.kotlinx.dataframe.io.ArrowWritingKt.writeArrowFeather(arrowWriting.kt:88)
... 1 more
FAILURE: Build failed with an exception.
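To make the numbers in the exception message concrete, here is a minimal sketch of the bounds arithmetic it reports. The helper `overflowBytes` is hypothetical, written only for illustration, and is not part of Arrow or Kotlin DataFrame. It just restates the check implied by "index: 31393, length: 2320 (expected: range(0, 32768))".

```kotlin
// Hypothetical helper (not an Arrow or DataFrame API): how many bytes a write
// of `length` bytes starting at `index` would run past a buffer of `capacity` bytes.
fun overflowBytes(index: Long, length: Long, capacity: Long): Long =
    maxOf(0L, index + length - capacity)

fun main() {
    // Values copied from the exception message above.
    val over = overflowBytes(index = 31393, length = 2320, capacity = 32768)
    println("write ends at byte ${31393 + 2320}, $over bytes past capacity")
    // prints "write ends at byte 33713, 945 bytes past capacity"
}
```

So the failing value would have ended 945 bytes beyond the 32768-byte data buffer. Arrow's Java vectors provide `setSafe` alongside `set`; `setSafe` reallocates when needed, while `set` assumes the buffer was sized in advance. Whether that distinction is the actual cause here is an assumption the trace alone does not confirm.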
Here is my demo code with writer and some debug information:
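import java.io.File
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*

val dataFrame = DataFrame.read("https://raw.githubusercontent.com/phodal-archive/apache-arrow-chapi-demo/master/data/0_codes.json")
dataFrame.schema().print()
val toArrowSchema = dataFrame.columns().toArrowSchema()
println(toArrowSchema.toJson())
dataFrame.writeArrowFeather(File("codes.arrow"))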
When I debug, dataFrame.schema().print() returns the correct schema:
NodeName: String
Module: String
Type: String
Package: String?
FilePath: String
Fields: *
TypeType: String
TypeKey: String
Modifiers: List<String>
TypeValue: String?
Annotations: *
Name: String
KeyValues: *
Key: String
Value: String
Implements: List<String>
Functions: *
Name: String
Package: String?
ReturnType: String
Parameters: *
TypeValue: String
TypeType: String
FunctionCalls: *
Package: String?
NodeName: String?
FunctionName: String
Position:
StartLine: Int
StartLinePosition: Int
StopLine: Int
StopLinePosition: Int
Parameters: *
TypeValue: String
TypeType: String
Type: String?
Position:
StartLine: Int
StartLinePosition: Int?
StopLine: Int
StopLinePosition: Int?
LocalVariables: *
TypeValue: String
TypeType: String
IsConstructor: Boolean?
Annotations: *
Name: String
KeyValues: *
Key: String
Value: String
Imports: *
Source: String
AsName: String
Position:
StartLine: Int?
StopLine: Int?
StartLinePosition: Int?
StopLinePosition: Int?
Annotations: *
Name: String
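But in dataFrame.columns().toArrowSchema() the types are wrong: the nested columns (Fields, Functions, Imports, Position, Annotations) and the list column Implements are all reported as utf8 with no children: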
{
"fields" : [ {
"name" : "NodeName",
"nullable" : false,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Module",
"nullable" : false,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Type",
"nullable" : false,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Package",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "FilePath",
"nullable" : false,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Fields",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Implements",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Functions",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Imports",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Position",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
}, {
"name" : "Annotations",
"nullable" : true,
"type" : {
"name" : "utf8"
},
"children" : [ ]
} ]
}
Am I missing something?