Ruby: Completely different hashes are marked as similar #188

Closed
@Mange

Two completely different constant hashes are marked as similar code. There is no workaround, because nothing more can be done to reduce this "similarity".

Here's a failing test case to show the problem:

diff --git a/spec/cc/engine/analyzers/ruby/main_spec.rb b/spec/cc/engine/analyzers/ruby/main_spec.rb
index 62ed1cd..c35aa84 100644
--- a/spec/cc/engine/analyzers/ruby/main_spec.rb
+++ b/spec/cc/engine/analyzers/ruby/main_spec.rb
@@ -149,6 +149,44 @@ module CC::Engine::Analyzers
           expect(run_engine(engine_conf)).to eq("")
         }.to output(/Skipping file/).to_stderr
       end
+
+      it "does not see hashes as similar" do
+        create_source_file("foo.rb", <<-EORUBY)
+          ANIMALS = {
+            bat: "Bat",
+            bee: "Bee",
+            cat: "Cat",
+            cow: "Cow",
+            dog: "Dog",
+            fly: "Fly",
+            human: "Human",
+            lizard: "Lizard",
+            owl: "Owl",
+            ringworm: "Ringworm",
+            salmon: "Salmon",
+            whale: "Whale",
+          }.freeze
+
+          TRANSPORT = {
+            airplane: "Airplane",
+            bicycle: "Bicycle",
+            bus: "Bus",
+            car: "Car",
+            escalator: "Escalator",
+            helicopter: "Helicopter",
+            lift: "Lift",
+            motorcycle: "Motorcycle",
+            rocket: "Rocket",
+            scooter: "Scooter",
+            skateboard: "Skateboard",
+            truck: "Truck",
+          }.freeze
+        EORUBY
+
+        issues = run_engine(engine_conf).strip.split("\0")
+
+        expect(issues.length).to eq(0)
+      end
     end
 
     describe "#calculate_points(mass)" do

This bug affects many of our larger projects, which contain various kinds of lookup tables. I can see a reason to regard hashes as similar when their keys and/or values are similar or identical, maybe even when the hashes are inverted (a: 1 -> 1 => :a), so this issue is not about those cases.

But in this case, two separate and completely unrelated hashes cause similarity lints to fail, which will only guide developers to write bad PRs or to ignore CodeClimate problems in PRs.

I know the reason behind the issue (the similarity engine only looks at the AST), but I was hoping that hashes could be made an exception so we don't have to disable the otherwise useful similarity engine.
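To illustrate the AST point, here is a minimal sketch of why two unrelated hashes can look identical to a purely structural comparison. It is not the engine's actual code: it uses Ruby's standard-library Ripper parser as a stand-in, and the `shape` and `same_shape?` helpers are hypothetical names made up for this example. The idea is simply that once the literal text of keys and values is dropped, only the tree shape is left.

```ruby
require "ripper"

# Hypothetical helper: reduce a Ripper S-expression to its shape.
# Scanner tokens such as [:@label, "bat:", [1, 2]] keep only their type,
# so the literal text and source position are discarded.
def shape(node)
  return node unless node.is_a?(Array)
  return node.first if node.first.is_a?(Symbol) && node.first.to_s.start_with?("@")

  node.map { |child| shape(child) }
end

# Hypothetical helper: true when two snippets parse to the same tree shape.
def same_shape?(source_a, source_b)
  shape(Ripper.sexp(source_a)) == shape(Ripper.sexp(source_b))
end

animals   = 'ANIMALS = { bat: "Bat", bee: "Bee", cat: "Cat" }.freeze'
transport = 'TRANSPORT = { airplane: "Airplane", bus: "Bus", car: "Car" }.freeze'
shorter   = 'SHORT = { bus: "Bus" }.freeze'

puts same_shape?(animals, transport) # unrelated contents, same number of entries
puts same_shape?(animals, shorter)   # different number of entries
```

Under these assumptions the first comparison should report the two hashes as structurally equal even though they share no keys or values, while the second should not, because the number of entries differs.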

I'm also open to helping you come up with a patch that fixes the problem, if you are okay with this behavior being changed. I want to bring this up before spending time writing a patch that might just get rejected at the outset. :-)
